Training continues ...

After 48 hours of training the results are much better, but still not good enough for long answers. The model can keep converging with more training, and the collected dataset still needs to be cleaned by hand.
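The hand-processing step could look something like the sketch below: a cheap automatic filter first, then a human decision on each remaining sample. The file names and record fields here are assumptions for illustration, not the actual layout of the dataset.

```python
import json

# Hypothetical file names and record fields; the real dataset may differ.
RAW_PATH = "collected_data.jsonl"
CLEAN_PATH = "cleaned_data.jsonl"

def looks_usable(record):
    """Cheap automatic checks before a sample is shown to a human reviewer."""
    text = record.get("text", "")
    return 20 <= len(text) <= 4000 and text.isprintable()

with open(RAW_PATH, encoding="utf-8") as raw, \
     open(CLEAN_PATH, "w", encoding="utf-8") as clean:
    for line in raw:
        record = json.loads(line)
        if not looks_usable(record):
            continue
        # A human decides each remaining case by hand.
        print(record["text"][:200])
        if input("keep? [y/n] ").strip().lower() == "y":
            clean.write(json.dumps(record, ensure_ascii=False) + "\n")
```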

The model architecture also needs to change: the model size must be reduced and the context length increased.
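As an illustration of what a smaller model with a longer context could look like, here is a hypothetical PyTorch configuration; all the sizes below are placeholder assumptions, not the values actually used in this project.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: fewer and narrower layers, longer context window.
VOCAB_SIZE = 32_000
D_MODEL = 256        # reduced hidden size
N_LAYERS = 6         # reduced depth
N_HEADS = 4
CONTEXT_LEN = 2048   # increased context length

class SmallTransformer(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(VOCAB_SIZE, D_MODEL)
        self.pos_emb = nn.Embedding(CONTEXT_LEN, D_MODEL)
        layer = nn.TransformerEncoderLayer(
            d_model=D_MODEL, nhead=N_HEADS, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=N_LAYERS)
        self.head = nn.Linear(D_MODEL, VOCAB_SIZE)

    def forward(self, tokens):
        # tokens: (batch, seq_len) of token ids, seq_len <= CONTEXT_LEN
        positions = torch.arange(tokens.size(1), device=tokens.device)
        x = self.tok_emb(tokens) + self.pos_emb(positions)
        return self.head(self.encoder(x))
```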

Parallel training is not very efficient with four GPUs spread across four computers. One of them should be replaced with a faster card with more memory, such as an RTX 4090 with 24 GB.
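One common way to run training across GPUs in separate machines is PyTorch DistributedDataParallel. The minimal sketch below uses a placeholder model and assumes the script is started with torchrun on each node; it is an illustration of the technique, not the actual training code.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # Rank and world size come from the launcher (torchrun) via env vars.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(512, 512).cuda(local_rank)  # placeholder model
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(100):
        x = torch.randn(32, 512, device="cuda")  # placeholder batch
        loss = model(x).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()   # DDP averages gradients across all machines here
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

This would be launched with something like `torchrun --nnodes=4 --nproc_per_node=1 --rdzv_backend=c10d --rdzv_endpoint=<master-ip>:29500 train.py` on every machine. With only one GPU per node, the gradient all-reduce crosses the network on every step, which is usually the bottleneck, and each synchronous step also waits for the slowest GPU.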
