Training continues ...
The model layers must also be modified: the model size must be reduced and context must be added.

Parallel training is not very efficient with four GPUs spread across four computers, since gradient synchronization over the network becomes the bottleneck. We should replace one GPU with a faster card with more memory, such as the RTX 4090 with 24 GB.
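As a rough illustration of why synchronizing gradients between separate machines hurts throughput, here is a back-of-the-envelope estimate. All numbers (model size, link speed) are illustrative assumptions, not measurements of this setup:

```python
# Rough per-step gradient sync cost for data-parallel training over
# Ethernet. All figures are illustrative assumptions, not measurements.

def sync_time_seconds(n_params: float,
                      bytes_per_param: int,
                      bandwidth_bytes_per_s: float) -> float:
    """Time to transfer one full copy of the gradients over the link."""
    return n_params * bytes_per_param / bandwidth_bytes_per_s

# Assumed example: 100M-parameter model, fp32 gradients, 1 Gbit/s Ethernet.
params = 100e6
link = 1e9 / 8  # 1 Gbit/s expressed in bytes per second
t = sync_time_seconds(params, 4, link)
print(f"{t:.1f} s per gradient sync")  # 400 MB over 125 MB/s -> 3.2 s
```

With several seconds spent on every synchronization, a single faster GPU with enough memory to hold the whole model can easily outrun four networked ones.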