Python, the text editor, and encoding

August 26th, 2006 by Bernard Lebel




Recently I ran into a severe problem. Whenever I tried to import modules in the Python command line shell, I’d get a syntax error pointing to the first line. In fact, trying to import the module in XSI from a custom command gave a syntax error at line… zero! What the?

So I started troubleshooting, and it finally came down to this: as soon as there was a single character in the file, I would get these errors. No matter what I wrote in the file, whether a “pass” statement or a # character, I would get the error. Along with the error, Python also gave me a “no encoding declared” warning.

>>> from lighting.LightTools.modify import LT_ReorientInfiniteLight2
__main__:1: DeprecationWarning: Non-ASCII character '\xff' in file E:\workgroup\Data\Scripts\lighting\LightTools\modify\LT_ReorientInfiniteLight2.py on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "E:\workgroup\Data\Scripts\lighting\LightTools\modify\LT_ReorientInfiniteLight2.py", line 1
     ?p
    ^
SyntaxError: invalid syntax

Notice the black square before the p (it shows up as “?” in the listing above). That made no sense, as the p was the first character (a “pass” statement).

I tried every possible thing, like overwriting the file with a new one, etc. Basically I spent a few hours trying to solve an impossible syntax error.

However, when I tried another text editor (UltraEdit), I no longer had the error. So it became clear that my first text editor, SciTE (Scintilla Text Editor), was at fault.

So in SciTE, I went to File > Encoding and noticed it was set to one of the two UCS encodings. I set it to Default.

Problem gone.
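
The '\xff' in the warning above is consistent with a byte-order mark (BOM) sitting in front of the first real character of the file. A quick way to confirm that kind of problem is to look at the first few bytes. The following is only a minimal sketch for a current Python 3 interpreter; the helper names `detect_bom` and `detect_bom_in_file` are made up for illustration and are not part of Python or SciTE.

```python
import codecs

# Known byte-order marks. The UTF-32 entries come first because the
# UTF-32 LE mark (FF FE 00 00) begins with the UTF-16 LE mark (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data):
    """Return the encoding implied by a leading BOM in data, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def detect_bom_in_file(path):
    """Read the first four bytes of the file at path and check them."""
    with open(path, "rb") as handle:
        return detect_bom(handle.read(4))
```

Calling `detect_bom_in_file` on the offending module should report a UTF-16 encoding if the editor saved it with one of the UCS settings, and None for a plain ASCII file. One caveat: a UTF-16 LE file whose first character is NUL would be reported as UTF-32 LE, which is unlikely for source code.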

8 Responses to “Python, the text editor, and encoding”

  1. Daniele Niero says:

    Yes, this is a common problem with Python. Personally, I consider Python a tool with very nice features but a horrible syntax. Making indentation part of the syntax, instead of using the more common brackets, exposes our code to this type of stupid syntax error. So it is necessary to use a good IDE when coding in Python. I suggest Komodo (commercial) or Eclipse with the PyDev extensions.

  2. I don’t see what the syntax has to do with the encoding or the problem I reported. But thanks for the suggestions.

    Bernard

  3. Daniele Niero says:

    Maybe I missed the real problem, but it looks like an indentation problem. Different encoding = different way to indent the words… or am I wrong? Since indentation is part of the syntax, Python is particularly exposed to this kind of problem. That was my point… :)

  4. Daniele Niero says:

    Yes! I did miss the real problem :) It doesn’t have anything to do with what I said :)
    Sorry Bernard.
    But now I’m very curious about it. I tried to replicate the problem without any luck (apparently I’m still missing what exactly it is). Can you post a simple piece of code that generates this kind of error?

    Cheers,
    Daniele

  5. Luc-Eric says:

    It’s because the file was saved as a Unicode text file.
    The first bytes of a Unicode (UTF-16 little-endian) text file are 0xFF 0xFE;
    this is how editors like Notepad know that the file isn’t ASCII.

  6. ekeko says:

    Hi:
    going straight to the problem and not losing myself in nonsense comments:

    did you try with (encoding example):

    # -*- coding: iso-8859-1 -*-

    at the very first line of the file?
    Check your [python-dir]\Lib\encodings\string_escape.py for a running example.

  7. diego nunes says:

    Don’t blame poor SciTE. It has been my editor of choice forever. The problem is that you’re using non-ASCII characters in Python source code without telling the compiler. PEP 263 (http://www.python.org/dev/peps/pep-0263) gives you several ways to do that (I personally like to use the second suggested option).
    As Python is really friendly to newbies, if you save your file as UTF-8 with BOM, Python will automatically understand that it’s dealing with UTF-8 and treat the strings properly. If you change the encoding in SciTE to “UTF-8 with BOM” or simply declare your encoding explicitly, you won’t have any problems.

  8. Jorgen says:

    I got this problem by creating my file with “echo "HI" > myfile.py” in PowerShell (new to Windows 7…). When I copied all the code into a new file using vim, the problem disappeared.
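
Re-saving the file from another editor, as several people above ended up doing, amounts to rewriting the same text in an encoding Python accepts. That step can also be scripted. Below is a rough sketch only, assuming Python 3 and a source file that is either UTF-16 with a BOM or UTF-8; the function name `resave_as_utf8` is hypothetical.

```python
import codecs


def resave_as_utf8(src_path, dst_path):
    """Rewrite src_path as UTF-8 without a BOM at dst_path."""
    with open(src_path, "rb") as handle:
        raw = handle.read()

    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order
        # and drops it from the decoded text.
        text = raw.decode("utf-16")
    else:
        # Assumption: anything else is UTF-8; "utf-8-sig" also strips
        # a UTF-8 BOM if one is present.
        text = raw.decode("utf-8-sig")

    # Write bytes directly so line endings are left untouched.
    with open(dst_path, "wb") as handle:
        handle.write(text.encode("utf-8"))
```

Writing to a separate destination path keeps the original file around in case the guess about its encoding was wrong.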