Electronics & Programming

develissimo

Open Source electronics development and programming


#1 March 26, 2008 15:10:40

s.

Error with database encoding in UTF8


I have a small problem. My database is MySQL with ENGINE=MyISAM and
CHARSET=utf8. I'm using Django revision 6411.


mysql> select id,name from core_category;
+----+----------------------+
| id | name                 |
+----+----------------------+
|  1 | a�os ha que no te v� |
|  2 | ámigo que onda!!!    |
|  3 | La lúna!!!           |
+----+----------------------+
3 rows in set (0.04 sec)
The first value was inserted through the Django admin and is latin1;
the bd sees it wrong.
The second value was inserted with a simple direct insert into the bd.
The third was inserted from the Django shell (python manage.py shell),
encoded as utf8, and it is OK!
See for yourself:

>>> import MySQLdb as pymy  # assumption: 'pymy' is not defined in the original post
>>> a=pymy.connect(host='localhost', db='test',user='sabri',passwd='sabri')
>>> c=a.cursor()
>>> c.execute('select id,name from core_category')
>>> d=c.fetchall()
>>> d
((1L, 'a\xf1os ha que no te v\xed'), (2L, '\xc3\xa1migo que onda!!!'),
(3L, 'La l\xc3\xbana!!!'))
>>> name=d[0][1]  # row 1, inserted through the Django admin
>>> name.decode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "encodings/utf_8.py", line 16, in decode
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1-4:
invalid data
>>> print name.decode('latin1')
años ha que no te ví
>>> name2=d[1][1]  # row 2, direct insert
>>> print name2.decode('utf8')
ámigo que onda!!!
>>> name2=d[2][1]  # row 3, inserted from the Django shell
>>> print name2.decode('utf8')
La lúna!!!

When I view my data through the Django admin, the only one that
displays correctly is the first one:
años ha que no te ví
La lúna!!!
ámigo que onda!!!

The thing is that Django takes the data as latin1 and saves it in that
encoding. When it displays my utf8-encoded data, it does so in a bizarre
way, because it interprets it as latin1.
Why does it do this? How can I configure it?
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to
For more options, visit this group at
http://groups.google.com/group/django-users?hl=en
-~----------~----~----~----~------~----~------~--~---


#3 March 26, 2008 19:11:00

Karen T.

Error with database encoding in UTF8


On Wed, Mar 26, 2008 at 10:10 AM, sabrina.miller wrote:

> I have a small problem. My database is MySQL with ENGINE=MyISAM and
> CHARSET=utf8. I'm using Django revision 6411.
>
> mysql> select id,name from core_category;
> +----+----------------------+
> | id | name                 |
> +----+----------------------+
> |  1 | a�os ha que no te v� |
> |  2 | ámigo que onda!!!    |
> |  3 | La lúna!!!           |
> +----+----------------------+
> 3 rows in set (0.04 sec)


Note that what you see here when you run mysql depends on a couple of
things:
- the character encoding of your terminal (I'm guessing yours is utf-8)
- the mysql connection characteristics, which default to latin1. That is,
  mysql defaults to assuming incoming data from the client is in latin1, and
  defaults to sending back latin1 results. So even if your table has charset
  utf8, the mysql command is going to translate it to latin1 for display here
  unless you issue a command like 'set names utf8'. For full details on this
  you may want to check out:
  http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html

Assuming that your terminal character encoding is utf-8 and you have not
issued a 'set names utf8' from mysql, the entries that look wrong above are
actually the last two, not the first one. I believe it is the methods by
which the second two were inserted that are causing the problem here,
because you are winding up with mysql supplying supposedly 'latin1' data
that looks correct when it is assumed to be utf8.
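
To make that last point concrete, here is a rough pure-Python model of the round trip. It is only a sketch, written for Python 3 and not run against a real server: the helper names server_store and server_send are made up for illustration, and Python's latin-1 codec stands in for MySQL's latin1, which is close enough for the characters in this thread.

```python
# Python 3 sketch: a pure-Python model of the charset conversion, not MySQL itself.
# server_store and server_send are hypothetical helper names.

def server_store(client_bytes: bytes, connection_charset: str) -> str:
    """Interpret the client's bytes using the connection charset.

    The returned str stands for the characters the server believes it
    received and therefore stores in the utf8 column.
    """
    return client_bytes.decode(connection_charset)


def server_send(stored_text: str, connection_charset: str) -> bytes:
    """Encode the stored characters into the connection charset for the client."""
    return stored_text.encode(connection_charset)


# Rows 2 and 3: utf-8 bytes sent over a connection the server assumes is latin1.
sent = "La lúna!!!".encode("utf-8")
stored = server_store(sent, "latin-1")     # the server sees two characters, not 'ú'
received = server_send(stored, "latin-1")  # the original utf-8 bytes come back

# Row 1: the column really holds 'años ...' and is sent back as latin1.
row1 = server_send("años ha que no te ví", "latin-1")

print(repr(stored))
print(received)
print(row1)
```

In this model the table ends up holding "La lÃºna!!!", yet the bytes that come back over the latin1 connection happen to be valid utf-8, so a utf-8 terminal shows them as if nothing were wrong. Row 1 comes back as genuine latin1, which a utf-8 terminal cannot display.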

> The first value was inserted through the Django admin and is latin1;
> the bd sees it wrong.


Django doesn't have any latin1 defaults, so it seems unlikely Django turned
utf8 data into latin1. It is more likely MySQL turned the utf8 supplied by
Django into latin1 for storage in a table with CHARSET latin1, or for
sending over a connection that it thinks is expecting latin1 (the defaults).

As an aside, I don't know what you mean by "the bd" here?


> The second value was inserted with a simple direct insert into the bd.


Which means I also don't know what this means, exactly. But it's easy
enough to directly insert utf8 data into a latin1 charset table via the
mysql command, especially since MySQL defaults to thinking data supplied by
the client is in latin1 charset. You need to issue the command 'set names
utf8' if you want to supply utf-8 encoded data.
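
The same applies when inserting from a script rather than the mysql command. A minimal sketch for Python 3, assuming a DB-API cursor with the %s parameter style that MySQLdb uses: RecordingCursor is a hypothetical stand-in so the example runs without a database, and insert_category_utf8 is an illustrative name, not an existing API. As far as I know MySQLdb's connect() also accepts a charset argument that does the equivalent, but check the documentation for your version.

```python
# Python 3 sketch. RecordingCursor is a hypothetical stand-in for a real
# DB-API cursor (for example one from MySQLdb), so this runs without MySQL.

class RecordingCursor:
    """Records the statements it is asked to execute instead of running them."""

    def __init__(self):
        self.statements = []

    def execute(self, sql, params=None):
        self.statements.append((sql, params))


def insert_category_utf8(cursor, name):
    """Declare the connection charset, then insert one row.

    'SET NAMES utf8' tells MySQL that the bytes sent by this client are
    utf-8 and that results should be sent back as utf-8.
    """
    cursor.execute("SET NAMES utf8")
    cursor.execute("INSERT INTO core_category (name) VALUES (%s)", (name,))


cursor = RecordingCursor()
insert_category_utf8(cursor, "La lúna!!!")
for sql, params in cursor.statements:
    print(sql, params)
```

With a real connection the first statement only needs to be issued once per connection, before any data is sent.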

> The third was inserted from the Django shell (python manage.py shell),
> encoded as utf8, and it is OK!


More details on how this one was done would be helpful in understanding what
happened here. Django should have issued the 'set names utf8' for you, so I
am surprised this one is coming out the same as the 'direct insert' case.

> See for yourself:
>
> >>> import MySQLdb as pymy  # assumption: 'pymy' is not defined in the original post
> >>> a=pymy.connect(host='localhost', db='test',user='sabri',passwd='sabri')
> >>> c=a.cursor()
> >>> c.execute('select id,name from core_category')
> >>> d=c.fetchall()
> >>> d
> ((1L, 'a\xf1os ha que no te v\xed'), (2L, '\xc3\xa1migo que onda!!!'),
> (3L, 'La l\xc3\xbana!!!'))
> >>> name=d[0][1]  # row 1, inserted through the Django admin
> >>> name.decode('utf8')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "encodings/utf_8.py", line 16, in decode
> UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1-4:
> invalid data
> >>> print name.decode('latin1')
> años ha que no te ví
> >>> name2=d[1][1]  # row 2, direct insert
> >>> print name2.decode('utf8')
> ámigo que onda!!!
> >>> name2=d[2][1]  # row 3, inserted from the Django shell
> >>> print name2.decode('utf8')
> La lúna!!!
>

Again, absent your issuing a 'set names utf8' on the connection, MySQL is
going to send back latin1, regardless of the table encoding. That makes
the second two results the odd-looking ones.
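
As a rough way to spot such rows, the byte strings from the session above can be sorted by whether they decode as utf-8. This is only a heuristic sketch for Python 3, and classify is a made-up helper name: it assumes the bytes were fetched over a latin1 connection, and short latin1 text can occasionally also be valid utf-8.

```python
# Python 3 sketch: a heuristic, assuming the bytes came back over a
# latin1 connection. classify is a hypothetical helper name.

def classify(raw: bytes) -> str:
    """Guess how a value fetched over a latin1 connection was stored."""
    try:
        raw.decode("ascii")
        return "ascii only, nothing to tell"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid utf-8, so these are plain latin1 bytes, as expected.
        return "latin1 as expected"
    # Valid utf-8 arriving over a latin1 connection suggests the client
    # sent utf-8 without telling MySQL, so the table holds it double-encoded.
    return "utf-8 bytes, probably double-encoded in the table"


# The three values from the fetchall() output quoted above.
rows = [
    b"a\xf1os ha que no te v\xed",
    b"\xc3\xa1migo que onda!!!",
    b"La l\xc3\xbana!!!",
]
for raw in rows:
    print(raw, "->", classify(raw))
```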

> When I view my data through the Django admin, the only one that
> displays correctly is the first one:
> años ha que no te ví
> La lúna!!!
> ámigo que onda!!!
>
> The thing is that Django takes the data as latin1 and saves it in that
> encoding. When it displays my utf8-encoded data, it does so in a bizarre
> way, because it interprets it as latin1.


It's MySQL that has the default of latin1 everywhere, not Django. You need
to be careful when not using Django and supplying utf8 data to MySQL that
you have told MySQL that the data is utf8-encoded. I believe there is a way
(described in the MySQL doc page I cited above) to globally change your
MySQL config so it will expect/supply utf8 instead of latin1, so you might
want to look into that.

Karen


#4 March 27, 2008 09:00:45

Peter M.

Error with database encoding in UTF8


On 3/26/08, Karen Tracey wrote:

> you have told MySQL that the data is utf8-encoded. I believe there is a way
> (described in the MySQL doc page I cited above) to globally change your
> MySQL config so it will expect/supply utf8 instead of latin1, so you might
> want to look into that.

We use this:

DROP DATABASE xxxxxx;
SET storage_engine=INNODB;
CREATE DATABASE xxxxxx DEFAULT CHARACTER SET utf8 COLLATE utf8_xxxxxx_ci;
