Subject: Re: 1.10.2 on Windows, megaput files with unicode names = error opening file
From: Ethan Dalool
Date: Tue, 3 Mar 2020 16:19:41 -0800

Hi Ondřej,

I really appreciate your time and the software that you've created, but I feel that you are responding to my emails without fully recognizing the problem that I am showing. I am a competent user and a programmer. I deal with Unicode on the command line daily, and I'm aware of the limitations and common problems that come with it. But in my experience this problem is unique to megaput. The error message shown on the screen is one I recognize from software trying to open a filename that contains illegal characters.
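
As an illustration of the failure mode described above, here is a minimal sketch under stated assumptions: `to_legacy_codepage` is a hypothetical helper, cp1252 stands in for whatever code page a lossy conversion might use, and the file name is made up. Nothing in this thread confirms that megatools or GLib performs this exact conversion.

```python
# Characters that Windows forbids in file names (path separators left out).
WINDOWS_RESERVED = set('<>:"|?*')


def to_legacy_codepage(name: str, codepage: str = "cp1252") -> str:
    """Round-trip a name through a legacy code page, replacing what does not fit.

    Hypothetical helper: it imitates a lossy conversion for illustration only,
    not the actual megatools/GLib code path.
    """
    return name.encode(codepage, errors="replace").decode(codepage)


def has_reserved_chars(filename: str) -> bool:
    """Return True if the bare file name contains a Windows-reserved character."""
    return any(ch in WINDOWS_RESERVED for ch in filename)


original = "ȳϼ.zip"  # made-up name with characters outside cp1252
mangled = to_legacy_codepage(original)
print(mangled)                      # ??.zip
print(has_reserved_chars(mangled))  # True
```

Because `?` is reserved in Windows file names, passing such a mangled name to `open()` on Windows typically fails with "Invalid argument", which resembles the wording of the megaput error.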

 

I am attaching a series of screenshots showing every possible incantation of megaput in cmd, powershell, and python subprocess; with charset=UTF-8, CP65001, and 65001; assigned in-shell via variable name, in-shell via the chcp command, and via the system environment variable editor. All of them have the same issue.
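
The Python subprocess variant of these attempts might look like the following minimal sketch. `build_env` and `run_megaput` are hypothetical helper names, and the code assumes megaput reads CHARSET from its process environment, as the quoted reply below describes. It only shows how the variable is handed to the child process; it is not a fix.

```python
import os
import subprocess

# CHARSET values listed in the paragraph above.
CHARSET_VALUES = ["UTF-8", "CP65001", "65001"]


def build_env(charset: str) -> dict:
    """Copy the current environment and set CHARSET for the child process."""
    env = dict(os.environ)
    env["CHARSET"] = charset
    return env


def run_megaput(megaput: str, config_file: str, filename: str, charset: str) -> bytes:
    """Run megaput once with the given CHARSET.

    Raises subprocess.CalledProcessError if megaput exits with an error.
    """
    command = [megaput, "--config", config_file, filename]
    return subprocess.check_output(
        command,
        stderr=subprocess.STDOUT,
        timeout=180,
        env=build_env(charset),
    )


# Example call (not executed here; the paths depend on the local setup):
# for value in CHARSET_VALUES:
#     run_megaput(r"D:\software\megatools\1.10.2\megaput.exe", "mega.ini", filename, value)
```

Passing a copy of `os.environ` keeps every other variable intact, so CHARSET is the only difference between runs.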

 

I only found ONE unique result, which is to set CHARSET=UTF-16. This produces a different error message, "(megaput.exe:21244): GLib-CRITICAL **: 16:14:31.850: ÿ_g", but the file still does not upload.

Perhaps there is a correct incantation here somewhere, but GLib is behaving differently from every other Unicode-enabled piece of command-line software that I use.

Thanks,

Ethan

 

-----Original Message-----
From: Ondřej Jirman <megatools@megous.com> 
Sent: Tuesday, March 3, 2020 3:30 PM
To: Ethan Dalool <ethan@voussoir.net>
Subject: Re: 1.10.2 on Windows, megaput files with unicode names = error opening file

 

On Tue, Mar 03, 2020 at 09:36:26AM -0800, Ethan Dalool wrote:

> Hi,
>
> The fact of the matter is, I don't actually use Powershell normally. I was only using it to prove that the charset of my terminal wasn't the cause of the problem.
>
> Actually, I discovered this bug because I'm calling megaput from Python's `subprocess` module. Specifically, my Python code is
>
> command = [megaput, '--config', f'{config_file}', f'{filename}']
> subprocess.check_output(command, stderr=subprocess.STDOUT, timeout=180)

 

Hello,

this means that you're passing filenames in UTF-8 encoding to megatools.

Therefore you need to configure the CHARSET environment variable accordingly.
Try either the MS value of CP65001 or UTF-8.

The description at https://megous.com/git/megatools/tree/README#n80 is inaccurate insofar as the CHARSET environment variable is also used for converting command line arguments to UTF-8 from the encoding specified in CHARSET.

regards,
                o.

 

> I didn't mention this earlier because I wanted to get straight to the point, and not distract from the conversation with Python.
>
> To be clear, I use subprocess with other programs on a daily basis, and they handle Unicode filenames ok. Even when I use windows cmd, which has an even more restrictive charset than powershell, it doesn't matter because subprocess passes the Unicode to the program properly. You said that glib is doing something -> utf-8 conversions, but from my experience with Python and subprocess, it should be receiving utf-8 from my calling process just fine. I have attached a screenshot demonstrating that I can use Unicode in my shell even when the shell can't display it.
>
> Also, your link says this (emphasis mine):
>
> > On Unix, the character sets are determined by consulting the environment variables G_FILENAME_ENCODING and G_BROKEN_FILENAMES.
> > On Windows, the character set used in the GLib API is always UTF-8 and said environment variables have no effect.
>
> I am a fellow programmer so I understand you're trying to close this ticket quickly. But I am quite sure I am doing everything properly for my system. Megaput is not accepting my Unicode input which is why I'm filing a bug report.

 

 

 

> Thanks,
>
> Ethan
>
> -----Original Message-----
> From: Ondřej Jirman <megatools@megous.com>
> Sent: Tuesday, March 3, 2020 9:07 AM
> To: Ethan Dalool <ethan@voussoir.net>
> Subject: Re: 1.10.2 on Windows, megaput files with unicode names = error opening file
>
> On Tue, Mar 03, 2020 at 08:22:11AM -0800, Ethan Dalool wrote:
>
> > Hi, Ondřej. Thanks for your very fast response.
> >
> > Your link says:
> >
> > > This is just a cosmetic issue. Internally, megatools always work with UTF-8 file names, and even if the tool's terminal output is corrupted, files names of downloaded/uploaded files will be correct.
> >
> > But the reason I sent this email is specifically because the files are failing to upload. It's not just a cosmetic issue. Please see my error message again.
> >
> > I don’t program in C++, but I know from Python experience that attempting to `open()` a filename that contains invalid characters yields the OS exception "Invalid argument". So when I see megatools displaying question-mark filenames, even when I'm using Powershell, which is capable of displaying UTF-8, and the "Cannot open file: Invalid Argument" exception, it makes me suspicious.
>
> Hello,
>
> none of the powershell stuff matters. The input/output from megatools is handled by glib library, which does utf8->something conversion when printing to the stdout and something->utf8 conversion when taking command line filename type arguments.
>
> Glib uses some environment variables to decide what that something will be.
> You need to have these environment variables set even under powershell, otherwise glib will cause a mess.
>
> You can read about it here: https://developer.gnome.org/glib/stable/glib-Character-Set-Conversion.html
>
> regards,
>                 o.
>
> > -----Original Message-----
> > From: Ondřej Jirman <megatools@megous.com>
> > Sent: Tuesday, March 3, 2020 3:12 AM
> > To: Ethan Dalool <ethan@voussoir.net>
> > Subject: Re: 1.10.2 on Windows, megaput files with unicode names = error opening file
> >
> > Hello,
> >
> > On Tue, Mar 03, 2020 at 12:42:48AM -0800, Ethan Dalool wrote:
> >
> > > Hi,
> > >
> > > First of all, thank you for megatools. I think it's the best software of its kind for interacting with mega.
> >
> > I'm glad megatools works for you.
> >
> > > I'm having some trouble using megaput to upload files with Unicode characters in the filename. It gives me this error:
> > >
> > > > D:\software\megatools\1.10.2\megaput.exe --config mega.ini "C:\outbox\ ȳ  ϼ   .zip"
> > >
> > > ERROR: Upload failed for 'C:\outbox\?????.zip': Can't read local file C:\outbox\?????.zip: Error opening file C:\outbox\?????.zip: Invalid argument
> >
> > https://megous.com/git/megatools/tree/README#n80
> >
> > does this help?
> >
> > regards,
> >             o.
> >
> > > I notice that the Unicode characters are being replaced by ? even though Powershell is capable of displaying them, which leads me to believe that somewhere internally, megaput is escaping the filename, converting out-of-page characters to ? prior to upload, and then choking on it. I know that many programs do this kind of escaping for display purposes but clearly this escaped name shouldn't be going to the upload routine.
> > >
> > > I hope this issue will be simple to resolve, and if I can provide anything else to make it easier please let me know.
> > >
> > > Thanks