Fine-tuning with limited hardware resources

In 2023, we are witnessing a boom in language models and their practical applications. ChatGPT has sparked interest in replicating its success, and many teams have published the results of their work. A large portion of the new models have been released under the Apache 2.0 license, which allows for their free modification, use, and …

PyTorch: dividing dataset, transformations, training on GPU and metric visualization

In machine learning, designing the structure of the model and training the neural network are relatively small elements of a longer chain of activities. We usually start with understanding business requirements, collecting and curating data, dividing it into training, validation and test subsets, and finally serving data to the model. Along the way, there are …

Data preparation with Dataset and DataLoader in PyTorch

Preparing data for machine learning is not a task that most AI professionals look forward to. Data vary in quality; most often they require very thorough analysis, sometimes manual review, and certainly selection and initial preprocessing. In the case of classification tasks, the division of a dataset into classes may be inappropriate or insufficiently balanced. …

