Context
I have a Gradio app called text-generation-webui, and users need Python to start it. Most people do not have Python installed on their computer, so I need to package the Python environment into the app. We use conda to manage the Python environment and conda-pack to package it.
I created a GUI using Gooey that starts the Python CLI in a subprocess. The code looks like this:
import signal
import subprocess


class YourClass:
    def __init__(self):
        self.stdout = []
        # Register the SIGTERM signal handler
        signal.signal(signal.SIGTERM, self.save_stdout)

    def save_stdout(self, signum, frame):
        # Save the captured output lines to a file
        with open('output.txt', 'w') as file:
            file.write('\n'.join(self.stdout))
        print("Saved stdout to file")

    def run(self, cmd):
        self.process = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            shell=True,
            text=True,
        )
        while True:
            output = self.process.stdout.readline()
            if output:
                self.stdout.append(output.strip())
                print(output.strip())
            # Stop once the process has exited and the pipe is drained
            if self.process.poll() is not None and not output:
                break
        return self.process.returncode


# Example
your_class_instance = YourClass()
your_class_instance.run("your command here")
This code captures all the output from the subprocess. It worked fine on macOS but not on Windows, and it took me a while to figure out why.
Problem
When I downloaded a new model, it threw the following exception:
  File "C:\Users\tczhong\Documents\LLM\new-web-ui-one-click\text-generation-webui\download-model.py", line 153, in get_single_file
    with tqdm.tqdm(total=total_size, unit='iB', unit_scale=True, bar_format='{l_bar}{bar}| {n_fmt:6}/{total_fmt:6} {rate_fmt:6}') as t:
  File "C:\Users\tczhong\Documents\LLM\new-web-ui-one-click\installer_files\env\lib\site-packages\tqdm\std.py", line 1137, in __exit__
    self.close()
  File "C:\Users\tczhong\Documents\LLM\new-web-ui-one-click\installer_files\env\lib\site-packages\tqdm\std.py", line 1299, in close
    self.display(pos=0)
  File "C:\Users\tczhong\Documents\LLM\new-web-ui-one-click\installer_files\env\lib\site-packages\tqdm\std.py", line 1492, in display
    self.sp(self.__str__() if msg is None else msg)
  File "C:\Users\tczhong\Documents\LLM\new-web-ui-one-click\installer_files\env\lib\site-packages\tqdm\std.py", line 347, in print_status
    fp_write('\r' + s + (' ' * max(last_len[0] - len_s, 0)))
  File "C:\Users\tczhong\Documents\LLM\new-web-ui-one-click\installer_files\env\lib\site-packages\tqdm\std.py", line 340, in fp_write
    fp.write(str(s))
  File "C:\Users\tczhong\Documents\LLM\new-web-ui-one-click\installer_files\env\lib\site-packages\tqdm\utils.py", line 127, in inner
    return func(*args, **kwargs)
OSError: [Errno 22] Invalid argument
The download always got stuck at about 2%, which was pretty strange.
Debug
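I spent a lot of time digging into tqdm because that is where the traceback pointed. However, tqdm was not the cause.
The subprocess itself crashed, but the real exception was:
'gbk' codec can't decode byte 0x8f in position 8: illegal multibyte sequence
It happened while reading the subprocess's stdout. With text=True and no explicit encoding, Popen decodes the pipe using the locale's default encoding, which was GBK on my Windows machine, and the output could not be decoded with it.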
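To make the failure concrete, here is a minimal sketch of the same kind of decoding error. The sample bytes are an assumed tqdm-style progress line encoded as UTF-8, not the output captured from my app, and `try_decode` is a hypothetical helper name.

```python
def try_decode(raw, encoding):
    """Return (True, text) if raw decodes with encoding, else (False, error message)."""
    try:
        return True, raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return False, str(exc)


# Assumed sample: a progress line containing a partial block character (U+258F),
# encoded as UTF-8 the way a child process might emit it.
sample = "\r  2%|\u258f         |".encode("utf-8")

ok_gbk, result_gbk = try_decode(sample, "gbk")
ok_utf8, result_utf8 = try_decode(sample, "utf-8")

print("gbk:", ok_gbk, result_gbk)
print("utf-8:", ok_utf8, repr(result_utf8))
```

The same bytes decode cleanly as UTF-8 and fail as GBK, which is why the reader side has to agree with the writer side on the encoding.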
Solution
There are several places where we can set the encoding.
Popen
We can set the encoding in the Popen call:
self.process = subprocess.Popen(
    cmd,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    shell=True,
    text=True,
    encoding='utf-8',
)
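As a rough sketch of this option in isolation, the helper below runs a child Python process that writes UTF-8 bytes and decodes them explicitly. `run_and_capture` and `DEMO_CHILD` are made-up names for illustration, and `errors="replace"` is my own addition so that a stray undecodable byte becomes a replacement character instead of raising.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd (a list of arguments) and return (returncode, output lines)."""
    process = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        encoding="utf-8",   # decode the pipe as UTF-8, not the locale default
        errors="replace",   # assumption: tolerate bad bytes instead of raising
    )
    lines = []
    # Iterating over the pipe yields one decoded line at a time until EOF.
    for line in process.stdout:
        lines.append(line.rstrip("\r\n"))
    process.wait()
    return process.returncode, lines


# Hypothetical child: writes a UTF-8 progress-style line straight to the pipe.
DEMO_CHILD = r"import sys; sys.stdout.buffer.write('2%|\u258f|\n'.encode('utf-8'))"

if __name__ == "__main__":
    code, lines = run_and_capture([sys.executable, "-c", DEMO_CHILD])
    print(code, lines)
```

This only helps when the child really writes UTF-8. If the child writes in the locale encoding instead, the reader and the writer still disagree.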
Gooey
We can set the encoding in the Gooey decorator:
@Gooey(
    program_name="Text Generation",
    default_size=(800, 600),
    encoding='utf-8',
)
Python
When we start Python, we can force it to use UTF-8 with the -X utf8 option:
python -X utf8
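To check what the flag actually changes, the sketch below starts a child interpreter and asks it for its UTF-8 mode flag and stdout encoding. It assumes Python 3.7 or later, where `sys.flags.utf8_mode` exists, and `child_utf8_report` is a hypothetical helper name.

```python
import os
import subprocess
import sys


def child_utf8_report(extra_args):
    """Start a child interpreter with extra_args; return (utf8_mode flag, stdout encoding)."""
    env = dict(os.environ)
    # Drop overrides so the result reflects only the arguments we pass.
    env.pop("PYTHONIOENCODING", None)
    env.pop("PYTHONUTF8", None)
    probe = "import sys; print(sys.flags.utf8_mode, sys.stdout.encoding)"
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", probe],
        stdout=subprocess.PIPE,
        encoding="utf-8",
        env=env,
        check=True,
    )
    flag, encoding = result.stdout.split()
    return int(flag), encoding


if __name__ == "__main__":
    print(child_utf8_report(["-X", "utf8"]))
```

Setting the environment variable PYTHONUTF8=1 before launching Python enables the same mode, which can be easier when the command line is not under your control.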
codecs writer
We can also use codecs to wrap sys.stdout and sys.stderr in UTF-8 writers:
import sys
import codecs

# Compare case-insensitively: Python usually reports the encoding as 'utf-8'
if sys.stdout.encoding.lower() != 'utf-8':
    sys.stdout = codecs.getwriter('utf-8')(sys.stdout.buffer, 'strict')
if sys.stderr.encoding.lower() != 'utf-8':
    sys.stderr = codecs.getwriter('utf-8')(sys.stderr.buffer, 'strict')
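A possible alternative to the codecs wrapper, which I did not use in the app, is the `reconfigure()` method that text streams have had since Python 3.7. The sketch below is only an illustration: `force_utf8` is a hypothetical helper, and the demo runs on an in-memory stream that starts out as GBK rather than on the real console.

```python
import io


def force_utf8(stream):
    """Switch a text stream to UTF-8 in place when it supports reconfigure()."""
    if hasattr(stream, "reconfigure"):
        stream.reconfigure(encoding="utf-8", errors="strict")
    return stream


# Demo on an in-memory stream that starts out as GBK.
buffer = io.BytesIO()
stream = io.TextIOWrapper(buffer, encoding="gbk")
force_utf8(stream)
stream.write("\u258f")
stream.flush()
print(buffer.getvalue())
```

In a real script this would be called as force_utf8(sys.stdout) and force_utf8(sys.stderr) near the top, before anything is printed.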
Conclusion
Using the codecs writer solved the problem.
This problem was frustrating to debug, and I hope this post helps you if you run into the same thing.