Subject: Re: 1.10.2 on Windows, megaput files with unicode names = error opening file
From: Ondřej Jirman
Date: Wed, 4 Mar 2020 00:29:33 +0100
On Tue, Mar 03, 2020 at 09:36:26AM -0800, Ethan Dalool wrote:
> Hi,
> 
>  
> 
> The fact of the matter is, I don't actually use Powershell normally. I was only using it to prove that the charset of my terminal wasn't the cause of the problem.
> 
>  
> 
> Actually, I discovered this bug because I'm calling megaput from Python's `subprocess` module. Specifically, my Python code is
> 
>  
> 
> command = [megaput, '--config', f'{config_file}', f'{filename}']
> 
> subprocess.check_output(command, stderr=subprocess.STDOUT, timeout=180)

Hello,

this means that you're passing filenames in UTF-8 encoding to megatools.

Therefore you need to set the CHARSET environment variable to match.
Try either UTF-8 or the Microsoft code page name, CP65001.

The description at https://megous.com/git/megatools/tree/README#n80 is inaccurate
insofar as the CHARSET environment variable is also used for converting command
line arguments to UTF-8 from the encoding specified in CHARSET.
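
A minimal sketch of how that could look from the Python side, assuming the
subprocess call quoted above. The helper names build_env and upload_file are
made up for illustration, and the sketch is untested against megaput on
Windows, so treat it as a starting point rather than a confirmed fix.

```python
import os
import subprocess


def build_env(charset="UTF-8", base_env=None):
    """Return a copy of the environment with CHARSET set.

    The copy keeps the parent's variables (PATH etc.) and leaves
    os.environ itself untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CHARSET"] = charset
    return env


def upload_file(megaput, config_file, filename, charset="UTF-8"):
    """Run megaput with CHARSET set only for the child process."""
    command = [megaput, "--config", config_file, filename]
    return subprocess.check_output(
        command,
        stderr=subprocess.STDOUT,
        timeout=180,
        env=build_env(charset),
    )
```

If UTF-8 makes no difference, the same call can be repeated with
charset="CP65001".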

regards,
	o.

>  
> 
> I didn't mention this earlier because I wanted to get straight to the point, and not distract from the conversation with Python.
> 
>  
> 
> To be clear, I use subprocess with other programs on a daily basis, and they handle Unicode filenames ok. Even when I use windows cmd, which has an even more restrictive charset than powershell, it doesn't matter because subprocess passes the Unicode to the program properly. You said that glib is doing something -> utf-8 conversions, but from my experience with Python and subprocess, it should be receiving utf-8 from my calling process just fine. I have attached a screenshot demonstrating that I can use Unicode in my shell even when the shell can't display it.
> 
>  
> 
> Also, your link says this (emphasis mine):
> 
>  
> 
> > On Unix, the character sets are determined by consulting the environment variables G_FILENAME_ENCODING and G_BROKEN_FILENAMES.
> 
> > On Windows, the character set used in the GLib API is always UTF-8 and said environment variables have no effect.
> 
>  
> 
> I am a fellow programmer so I understand you're trying to close this ticket
> quickly. But I am quite sure I am doing everything properly for my system.
> Megaput is not accepting my Unicode input which is why I'm filing a bug
> report.



>  
> 
> Thanks,
> 
> Ethan
> 
>  
> 
> -----Original Message-----
> From: Ondřej Jirman <megatools@megous.com> 
> Sent: Tuesday, March 3, 2020 9:07 AM
> To: Ethan Dalool <ethan@voussoir.net>
> Subject: Re: 1.10.2 on Windows, megaput files with unicode names = error opening file
> 
>  
> 
> On Tue, Mar 03, 2020 at 08:22:11AM -0800, Ethan Dalool wrote:
> 
> > Hi, Ondřej. Thanks for your very fast response.
> 
> > 
> 
> > Your link says:
> 
> > 
> 
> > > This is just a cosmetic issue. Internally, megatools always work 
> 
> > > with UTF-8
> 
> > file names, and even if the tool's terminal output is corrupted, files 
> 
> > names of downloaded/uploaded files will be correct.
> 
> > 
> 
> > But the reason I sent this email is specifically because the files are 
> 
> > failing to upload. It's not just a cosmetic issue. Please see my error message again.
> 
> > 
> 
> > I don’t program in C++, but I know from Python experience that 
> 
> > attempting to `open()` a filename that contains invalid characters 
> 
> > yields the OS exception "Invalid argument". So when I see megatools 
> 
> > displaying question-mark filenames, even when I'm using Powershell 
> 
> > which is capable of displaying UTF-8, and the "Cannot open file: Invalid Argument" exception, it makes me suspicious.
> 
>  
> 
> Hello,
> 
>  
> 
> none of the powershell stuff matters. The input/output of megatools is handled by the glib library, which does a utf8->something conversion when printing to stdout and a something->utf8 conversion when taking filename-type command line arguments.
> 
>  
> 
> Glib uses some environment variables to decide what that something will be.
> 
> You need to have these environment variables set even under powershell, otherwise glib will cause a mess.
> 
>  
> 
> You can read about it here: https://developer.gnome.org/glib/stable/glib-Character-Set-Conversion.html
> 
>  
> 
> regards,
> 
>                 o.
> 
>  
> 
> > -----Original Message-----
> 
> > From: Ondřej Jirman <megatools@megous.com>
> 
> > Sent: Tuesday, March 3, 2020 3:12 AM
> 
> > To: Ethan Dalool <ethan@voussoir.net>
> 
> > Subject: Re: 1.10.2 on Windows, megaput files with unicode names = 
> 
> > error opening file
> 
> > 
> 
> > Hello,
> 
> > 
> 
> > On Tue, Mar 03, 2020 at 12:42:48AM -0800, Ethan Dalool wrote:
> 
> > > Hi,
> 
> > > 
> 
> > >  
> 
> > > 
> 
> > > First of all, thank you for megatools. I think it's the best 
> 
> > > software of its kind for interacting with mega.
> 
> > > 
> 
> > 
> 
> > I'm glad megatools works for you.
> 
> > 
> 
> > > 
> 
> > > I'm having some trouble using megaput to upload files with Unicode 
> 
> > > characters in the filename. It gives me this error:
> 
> > > 
> 
> > >  
> 
> > > 
> 
> > > > D:\software\megatools\1.10.2\megaput.exe --config mega.ini
> 
> > > "C:\outbox\ ȳ  ϼ   .zip"
> 
> > > 
> 
> > >  
> 
> > > 
> 
> > > ERROR: Upload failed for 'C:\outbox\?????.zip': Can't read local 
> 
> > > file
> 
> > > C:\outbox\?????.zip: Error opening file C:\outbox\?????.zip: Invalid 
> 
> > > argument
> 
> > > 
> 
> > 
> 
> > https://megous.com/git/megatools/tree/README#n80
> 
> > 
> 
> > does this help?
> 
> > 
> 
> > regards,
> 
> >             o.
> 
> > 
> 
> > > 
> 
> > > I notice that the Unicode characters are being replaced by ? even 
> 
> > > though Powershell is capable of displaying them, which leads me to 
> 
> > > believe that somewhere internally, megaput is escaping the filename, 
> 
> > > converting out-of-page characters to ? prior to upload, and then 
> 
> > > choking on it. I know that many programs do this kind of escaping 
> 
> > > for display purposes but clearly this escaped name shouldn't be going to the upload routine.
> 
> > > 
> 
> > >  
> 
> > > 
> 
> > > I hope this issue will be simple to resolve, and if I can provide 
> 
> > > anything else to make it easier please let me know.
> 
> > > 
> 
> > >  
> 
> > > 
> 
> > > Thanks
> 
> > > 
> 
> > 
>