#1 Dec. 1, 2010 03:31:32

Anurag C.
Registered: 2009-11-02

Django 1.2.3, Oracle with Character Set WE8MSWIN1252 - 'utf8' codec can't decode bytes


Hi All,

On Oracle 10.2 with the character set WE8MSWIN1252, when I use Django to
select a row that contains a field with the value 'Páginas', I get the
following error: "'utf8' codec can't decode bytes".

Here is the trace from the Python command prompt.

>>> tmlist = TerminologyMap.objects.filter(id=206)
>>> tmlist
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/site-packages/django/db/models/query.py", line 68, in __repr__
    data = list(self)
  File "/usr/lib/python2.5/site-packages/django/db/models/query.py", line 83, in __len__
    self._result_cache.extend(list(self._iter))
  File "/usr/lib/python2.5/site-packages/django/db/models/query.py", line 269, in iterator
    for row in compiler.results_iter():
  File "/usr/lib/python2.5/site-packages/django/db/models/sql/compiler.py", line 672, in results_iter
    for rows in self.execute_sql(MULTI):
  File "/usr/lib/python2.5/site-packages/django/db/models/sql/compiler.py", line 741, in <lambda>
    result = iter((lambda: cursor.fetchmany(GET_ITERATOR_CHUNK_SIZE)),
  File "/usr/lib/python2.5/site-packages/django/db/backends/oracle/base.py", line 552, in fetchmany
    for r in self.cursor.fetchmany(size)])
  File "/usr/lib/python2.5/site-packages/django/db/backends/oracle/base.py", line 625, in _rowfactory
    value = to_unicode(value)
  File "/usr/lib/python2.5/site-packages/django/db/backends/oracle/base.py", line 636, in to_unicode
    return force_unicode(s)
  File "/usr/lib/python2.5/site-packages/django/utils/encoding.py", line 88, in force_unicode
    raise DjangoUnicodeDecodeError(s, *e.args)
django.utils.encoding.DjangoUnicodeDecodeError: 'utf8' codec can't decode bytes in position 1-3: invalid data. You passed in 'P\xe1ginas' (<type 'str'>)

Please let me know if I am doing something wrong here or if there is already
a solution available for this encoding problem.

Regards,
Anurag

--
You received this message because you are subscribed to the Google Groups
"Django users" group.
To post to this group, send email to django-us...@googlegroups.com.
To unsubscribe from this group, send email to
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-users?hl=en.

#2 Dec. 1, 2010 17:21:30

Ian K.
Registered: 2009-11-02

On Nov 30, 8:31 pm, Anurag Chourasia <anurag.choura...@gmail.com>
wrote:
> On Oracle 10.2 with Character-Set set to WE8MSWIN1252,
>
> When using Django, I try to select a Oracle row which contains a field with
> value as 'Páginas', i encounter the following error "'utf8' codec can't
> decode bytes "

The NLS_LANG setting used by Django should guarantee that the data
comes back as UTF-8 regardless of the database character set.

What version of cx_Oracle are you using?
Is the column type VARCHAR2 or NVARCHAR2?
What do you get if you try the following, substituting the appropriate
values?

$ export NLS_LANG=.UTF8
$ python
>>> import cx_Oracle
>>> conn = cx_Oracle.connect('username/passw...@dsn')
>>> cursor = conn.cursor()
>>> cursor.execute("SELECT PROBLEM_COLUMN FROM TERMINOLOGY_MAP WHERE ID = 206")
>>> print repr(cursor.fetchone())

Thanks,
Ian

#3 Dec. 2, 2010 02:58:39

Anurag C.
Registered: 2009-11-02

Hi Ian,

Thanks for the response.

With cx_Oracle (version 5.0.3), retrieving that field value works fine,
as in my original email.

It's only when I access it through the Django models that it fails.

The two examples below should make this clearer.

This is using Django models and it fails
==============================
>>> TerminologyMap.objects.filter(id=316)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/site-packages/django/db/models/query.py", line 68, in __repr__
    data = list(self)
  File "/usr/lib/python2.5/site-packages/django/db/models/query.py", line 83, in __len__
    self._result_cache.extend(list(self._iter))
  File "/usr/lib/python2.5/site-packages/django/db/models/query.py", line 269, in iterator
    for row in compiler.results_iter():
  File "/usr/lib/python2.5/site-packages/django/db/models/sql/compiler.py", line 672, in results_iter
    for rows in self.execute_sql(MULTI):
  File "/usr/lib/python2.5/site-packages/django/db/models/sql/compiler.py", line 741, in <lambda>
    result = iter((lambda: cursor.fetchmany(GET_ITERATOR_CHUNK_SIZE)),
  File "/usr/lib/python2.5/site-packages/django/db/backends/oracle/base.py", line 552, in fetchmany
    for r in self.cursor.fetchmany(size)])
  File "/usr/lib/python2.5/site-packages/django/db/backends/oracle/base.py", line 625, in _rowfactory
    value = to_unicode(value)
  File "/usr/lib/python2.5/site-packages/django/db/backends/oracle/base.py", line 636, in to_unicode
    return force_unicode(s)
  File "/usr/lib/python2.5/site-packages/django/utils/encoding.py", line 88, in force_unicode
    raise DjangoUnicodeDecodeError(s, *e.args)
django.utils.encoding.DjangoUnicodeDecodeError: 'utf8' codec can't decode bytes in position 22-24: invalid data. You passed in 'Registro guardado con \xe9xito' (<type 'str'>)

This is using cx_Oracle and it works fine
===============================
>>> cx_Oracle.version
'5.0.3'
>>> cursor.execute("select to_term from terminology_map where id=316")
>>> cursor.fetchone()
'Registro guardado con \xe9xito'

Regards,
Anurag

#4 Dec. 2, 2010 07:50:41

Ian K.
Registered: 2009-11-02

On Wed, Dec 1, 2010 at 7:58 PM, Anurag Chourasia
<anurag.choura...@gmail.com> wrote:

> This is using cx_Oracle and it works fine
> ===============================
> >>> cx_Oracle.version
> '5.0.3'
> >>> cursor.execute("select to_term from terminology_map where id=316")
> >>> cursor.fetchone()
> 'Registro guardado con \xe9xito'

It's not clear to me which setting you used here. Was this using the
.UTF8 NLS_LANG as I requested? If so, then in fact this is also
coming back incorrectly, because that is not UTF-8. If not, then
we're comparing apples to oranges, since Django uses the .UTF8
setting.
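
To make the point about the bytes concrete, here is a minimal sketch. It is written in Python 3 syntax (the thread itself uses Python 2.5), and the variable names are illustrative only. It decodes the string from the traceback with both codecs and shows what a UTF-8 client would be expected to receive instead.

```python
# Bytes exactly as they appear in the traceback; 0xE9 is "e acute" in Windows-1252.
raw = b'Registro guardado con \xe9xito'

# Decoding as Windows-1252 (Python codec name "cp1252") succeeds.
as_cp1252 = raw.decode('cp1252')

# Decoding as UTF-8 fails: 0xE9 announces a multi-byte sequence, but the
# bytes that follow it ("xi") are not valid continuation bytes.
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# The same text encoded as UTF-8 uses two bytes for the accented letter.
expected_utf8 = as_cp1252.encode('utf-8')

print(utf8_ok)
print(expected_utf8)
```

So a bare \xe9 in the result suggests the data arrived in the database character set rather than in UTF-8.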

Thanks,
Ian

#5 Dec. 2, 2010 13:43:50

Anurag C.
Registered: 2009-11-02

Hi Ian,

Yes, I set NLS_LANG in my shell to UTF8 before trying this.

The query through the Django model still fails (the direct query using
cx_Oracle works fine).

Regards,
Anurag

#6 Dec. 2, 2010 18:37:04

Ian K.
Registered: 2009-11-02

On Thu, Dec 2, 2010 at 6:43 AM, Anurag Chourasia
<anurag.choura...@gmail.com> wrote:
> Hi Ian,
> Yes.....I set the NLS_LANG in my shell to UTF8 before trying this.
> Query using Django model still fails (direct query using cx_Oracle works
> fine)
> Regards,
> Anurag

Okay, so it would appear that the client encoding is not being honored
by Oracle for some reason. Just to verify that Django is setting it
correctly in the first place, would you please try the following in a
Django shell and let me know what you get?

>>> import os
>>> print os.environ['NLS_LANG']
>>> from django.db import connection
>>> connection.cursor()  # Initialize the connection
>>> print os.environ['NLS_LANG']
>>> print connection.connection.encoding
>>> print connection.connection.nencoding

If everything is correct then the second NLS_LANG should be ".UTF8"
and both encodings should be "UTF-8". If that is the case then I
think your next step should be to try the cx_Oracle mailing list.
Perhaps Anthony or somebody else there will have some idea why
cx_Oracle or OCI are returning strings with the wrong encoding.

Otherwise, we will need to figure out why the client encoding is not
being set correctly by Django.
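
The checks described here can be bundled into one small helper. This is only a sketch: check_client_encoding is a hypothetical name, not part of Django or cx_Oracle, and it is written in Python 3 syntax although the thread uses Python 2.5. It takes the NLS_LANG value and the two connection encodings and lists whatever deviates from the expected settings.

```python
def check_client_encoding(nls_lang, encoding, nencoding):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # Django is expected to have set NLS_LANG to ".UTF8" by this point.
    if nls_lang != '.UTF8':
        problems.append('NLS_LANG is %r, expected ".UTF8"' % (nls_lang,))
    # Both connection encodings should then be reported as "UTF-8".
    for name, value in (('encoding', encoding), ('nencoding', nencoding)):
        if value != 'UTF-8':
            problems.append('%s is %r, expected "UTF-8"' % (name, value))
    return problems


# Made-up values for illustration, not taken from a real session.
print(check_client_encoding('.UTF8', 'UTF-8', 'UTF-8'))
print(check_client_encoding(None, 'UTF-8', 'ISO-8859-1'))
```

In a Django shell the arguments would come from os.environ.get('NLS_LANG'), connection.connection.encoding and connection.connection.nencoding, read after connection.cursor() has initialized the connection.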

Ian

#7 Dec. 2, 2010 20:22:06

Anurag C.
Registered: 2009-11-02

Hi Ian,

Here is the information you requested.

$ python
Python 2.5.2 (r252:60911, Dec 2 2008, 09:26:14)
on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> print os.environ['NLS_LANG']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/UserDict.py", line 22, in __getitem__
    raise KeyError(key)
KeyError: 'NLS_LANG'
>>> from django.db import connection
>>> connection.cursor()  # Initialize the connection
>>> print os.environ['NLS_LANG']
.UTF8
>>> print connection.connection.encoding
WINDOWS-1252
>>> print connection.connection.nencoding
WINDOWS-1252
>>>

Regards,
Anurag

#8 Dec. 2, 2010 22:24:19

Ian K.
Registered: 2009-11-02

On Thu, Dec 2, 2010 at 1:21 PM, Anurag Chourasia
<anurag.choura...@gmail.com> wrote:
> Hi Ian,
> Here is the information requested by you.
> $ python
> Python 2.5.2 (r252:60911, Dec  2 2008, 09:26:14)
> on cygwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import os
> >>> print os.environ['NLS_LANG']
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib/python2.5/UserDict.py", line 22, in __getitem__
>     raise KeyError(key)
> KeyError: 'NLS_LANG'
> >>> from django.db import connection
> >>> connection.cursor()  # Initialize the connection
> >>> print os.environ['NLS_LANG']
> .UTF8
> >>> print connection.connection.encoding
> WINDOWS-1252
> >>> print connection.connection.nencoding
> WINDOWS-1252

Weird. From what I can tell, this seems to have something to do with
Cygwin, or at least I'm able to replicate it in that environment.
Setting NLS_LANG in or out of process and changing the registry key
all have no effect.

Is there some reason you need to use Cygwin for this? Perhaps you
would have more luck with the regular win32 python.

Ian

#9 Dec. 2, 2010 23:18:08

Ian K.
Registered: 2009-11-02

On Thu, Dec 2, 2010 at 3:23 PM, Ian Kelly <ian.g.ke...@gmail.com> wrote:
> Weird.  From what I can tell, this seems to have something to do with
> Cygwin, or at least I'm able to replicate it in that environment.
> Setting NLS_LANG in or out of process and changing the registry key
> all have no effect.

The actual problem is described here:

http://rubyforge.org/forum/forum.php?thread_id=6826&forum_id=1078

and from the cx-oracle-users mailing list comes this suggestion:

>>> import ctypes
>>> ctypes.CDLL('kernel32').SetEnvironmentVariableA('NLS_LANG', '.UTF8')

I've tried it, and it works. I suggest patching the above into your
django/db/backends/oracle/base.py file, in place of the line:

os.environ['NLS_LANG'] = '.UTF8'
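
A guarded version of that suggestion might look like the sketch below. Several things here are assumptions rather than anything from Django or cx_Oracle: the function name set_nls_lang is made up, the code is Python 3 syntax (so the ANSI Win32 call receives byte strings), and it assumes that the Oracle client reads the Win32 environment rather than the one Cygwin's Python modifies. On any platform other than Cygwin it only sets os.environ.

```python
import ctypes
import os
import sys


def set_nls_lang(value='.UTF8'):
    """Set NLS_LANG for this process; return True if the Win32 call succeeded."""
    # The usual route: visible to Python code and to child processes.
    os.environ['NLS_LANG'] = value

    # Outside Cygwin there is no separate Win32 environment to update.
    if sys.platform != 'cygwin':
        return False
    try:
        kernel32 = ctypes.CDLL('kernel32')
    except OSError:
        # kernel32 could not be loaded, so os.environ is all we can set.
        return False
    # SetEnvironmentVariableA returns a nonzero value on success.
    result = kernel32.SetEnvironmentVariableA(b'NLS_LANG', value.encode('ascii'))
    return bool(result)


print(set_nls_lang('.UTF8'))
print(os.environ['NLS_LANG'])
```

Since the patch replaces a line in the backend's base.py, the call presumably has to run before the first database connection is opened.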

Cheers,
Ian

#10 Dec. 3, 2010 00:34:37

Anurag C.
Registered: 2009-11-02

Hi Ian,

I just tried the ctypes solution that you mentioned in your previous email,
but it does not work for me.

Below is my session transcript.

>>> import ctypes
>>> ctypes.CDLL('kernel32').SetEnvironmentVariableA('NLS_LANG', '.UTF8')
1
>>> TerminologyMap.objects.get(term_id=8)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/site-packages/django/db/models/manager.py", line 132, in get
    return self.get_query_set().get(*args, **kwargs)
  File "/usr/lib/python2.5/site-packages/django/db/models/query.py", line 336, in get
    num = len(clone)
  File "/usr/lib/python2.5/site-packages/django/db/models/query.py", line 81, in __len__
    self._result_cache = list(self.iterator())
  File "/usr/lib/python2.5/site-packages/django/db/models/query.py", line 269, in iterator
    for row in compiler.results_iter():
  File "/usr/lib/python2.5/site-packages/django/db/models/sql/compiler.py", line 672, in results_iter
    for rows in self.execute_sql(MULTI):
  File "/usr/lib/python2.5/site-packages/django/db/models/sql/compiler.py", line 741, in <lambda>
    result = iter((lambda: cursor.fetchmany(GET_ITERATOR_CHUNK_SIZE)),
  File "/usr/lib/python2.5/site-packages/django/db/backends/oracle/base.py", line 552, in fetchmany
    for r in self.cursor.fetchmany(size)])
  File "/usr/lib/python2.5/site-packages/django/db/backends/oracle/base.py", line 625, in _rowfactory
    value = to_unicode(value)
  File "/usr/lib/python2.5/site-packages/django/db/backends/oracle/base.py", line 636, in to_unicode
    return force_unicode(s)
  File "/usr/lib/python2.5/site-packages/django/utils/encoding.py", line 88, in force_unicode
    raise DjangoUnicodeDecodeError(s, *e.args)
django.utils.encoding.DjangoUnicodeDecodeError: 'utf8' codec can't decode bytes in position 22-24: invalid data. You passed in 'Registro guardado con \xe9xito' (<type 'str'>)

Regards,
Anurag
