From: Thomas 'PointedEars' Lahn <PointedEars@web.de>
Newsgroups: comp.databases.mysql
Subject: Re: mysql lenght() and char_length not working for longer texts
Date: Tue, 07 Jul 2015 12:09:37 +0200
Organization: PointedEars Software (PES)
Lines: 30
Message-ID: <2401882.BDR5A6XEUG@PointedEars.de>
References: <6dd880fe-c726-4913-b05e-d06bc0e42d4c@googlegroups.com> <mn3qj0$fse$1@dont-email.me> <varchar-length-20150705231630@ram.dialup.fu-berlin.de> <mnes4u$u7l$1@dont-email.me> <4939682.h9gfqvCFYZ@PointedEars.de> <fhds6c-cjj.ln1@xl.homelinux.org> <3594764.8s91zndXLF@PointedEars.de>
Reply-To: Thomas 'PointedEars' Lahn <usenet@PointedEars.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8Bit
X-Complaints-To: abuse@news.solani.org
NNTP-Posting-Date: Tue, 7 Jul 2015 10:12:41 +0000 (UTC)
User-Agent: KNode/4.14.2

Thomas 'PointedEars' Lahn wrote:

> Axel Schwenke wrote:
>> Thomas 'PointedEars' Lahn <PointedEars@web.de> wrote:
>>> Lennart Jonsson wrote:
>>>> Assuming this is true, there are characters that are represented with
>>>> more than two bytes so just doubling the space wont suffice,
>>> It will not.  In UTF-8, a Unicode character can be encoded with up to 6
>>> 8-bit bytes.
>> Again: the context is MySQL. Here a column with CHARACTER SET utf8 can
>> use at most 3 bytes per character because MySQL supports only
>> characters from the BMP (Unicode U+0 ... U+FFFF) […]
> 
> This is confirmed by the manual.  It also confirms that MySQL can support
> characters beyond the BMP now with using different "character sets".  (A
> pity that you did not care to substantiate your statements with a
> reference.)

I see now that I had overlooked that part while trimming.  You mentioned the 
other "character sets" and did provide a reference, albeit an outdated one.

One wonders, though, why you have not contradicted Lennart just as
emphatically, given that he claimed four-byte code sequences to be possible
in the context of MySQL’s "utf8".
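
For illustration only, a minimal Python sketch of the byte counts under
discussion.  It does not touch MySQL at all; it merely shows how many bytes
plain UTF-8 needs per code point, which is why a three-byte limit covers
exactly the BMP and nothing beyond it.  The helper name utf8_len and the
sample characters are my own choice, not taken from the manual.

```python
def utf8_len(ch: str) -> int:
    """Return the number of bytes UTF-8 needs to encode one code point."""
    return len(ch.encode("utf-8"))


# Sample code points, chosen for illustration (an assumption, not from the thread).
samples = [
    ("U+0041", "\u0041"),       # LATIN CAPITAL LETTER A
    ("U+00E4", "\u00e4"),       # LATIN SMALL LETTER A WITH DIAERESIS
    ("U+20AC", "\u20ac"),       # EURO SIGN
    ("U+FFFF", "\uffff"),       # highest code point of the BMP
    ("U+1F600", "\U0001f600"),  # GRINNING FACE, outside the BMP
]

for name, ch in samples:
    # Print the code point and its UTF-8 length in bytes.
    print(f"{name}: {utf8_len(ch)} byte(s)")
```

Everything up to U+FFFF fits into at most three bytes; the first code point
beyond the BMP already takes four.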

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.
