Training continues ...

After 48 hours of training the results are much better, but still not good enough for long answers. The model can keep converging with more training, and the collected dataset still needs to be cleaned by hand.
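The hand-processing step could look something like the sketch below: a cheap automatic filter first, then a human decision on each remaining sample. The file names and record fields here are assumptions for illustration, not the actual layout of the dataset.

```python
import json

# Hypothetical file names and record fields; the real dataset may differ.
RAW_PATH = "collected_data.jsonl"
CLEAN_PATH = "cleaned_data.jsonl"

def looks_usable(record):
    """Cheap automatic checks before a sample is shown to a human reviewer."""
    text = record.get("text", "")
    return 20 <= len(text) <= 4000 and text.isprintable()

with open(RAW_PATH, encoding="utf-8") as raw, \
     open(CLEAN_PATH, "w", encoding="utf-8") as clean:
    for line in raw:
        record = json.loads(line)
        if not looks_usable(record):
            continue
        # A human decides each remaining case by hand.
        print(record["text"][:200])
        if input("keep? [y/n] ").strip().lower() == "y":
            clean.write(json.dumps(record, ensure_ascii=False) + "\n")
```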

The model architecture also needs to change: the model size must be reduced and the context length increased.
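As an illustration of what a smaller model with a longer context could look like, here is a hypothetical PyTorch configuration; all the sizes below are placeholder assumptions, not the values actually used in this project.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: fewer and narrower layers, longer context window.
VOCAB_SIZE = 32_000
D_MODEL = 256        # reduced hidden size
N_LAYERS = 6         # reduced depth
N_HEADS = 4
CONTEXT_LEN = 2048   # increased context length

class SmallTransformer(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(VOCAB_SIZE, D_MODEL)
        self.pos_emb = nn.Embedding(CONTEXT_LEN, D_MODEL)
        layer = nn.TransformerEncoderLayer(
            d_model=D_MODEL, nhead=N_HEADS, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=N_LAYERS)
        self.head = nn.Linear(D_MODEL, VOCAB_SIZE)

    def forward(self, tokens):
        # tokens: (batch, seq_len) of token ids, seq_len <= CONTEXT_LEN
        positions = torch.arange(tokens.size(1), device=tokens.device)
        x = self.tok_emb(tokens) + self.pos_emb(positions)
        return self.head(self.encoder(x))
```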

Parallel training is not very efficient with four GPUs spread across four computers. One of them should be replaced with a faster card with more memory, such as an RTX 4090 with 24 GB.
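One common way to run training across GPUs in separate machines is PyTorch DistributedDataParallel. The minimal sketch below uses a placeholder model and assumes the script is started with torchrun on each node; it is an illustration of the technique, not the actual training code.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # Rank and world size come from the launcher (torchrun) via env vars.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(512, 512).cuda(local_rank)  # placeholder model
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(100):
        x = torch.randn(32, 512, device="cuda")  # placeholder batch
        loss = model(x).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()   # DDP averages gradients across all machines here
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

This would be launched with something like `torchrun --nnodes=4 --nproc_per_node=1 --rdzv_backend=c10d --rdzv_endpoint=<master-ip>:29500 train.py` on every machine. With only one GPU per node, the gradient all-reduce crosses the network on every step, which is usually the bottleneck, and each synchronous step also waits for the slowest GPU.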
