July 13, 2020

Diving into NLP

Quite some time has passed since my last post, and a lot has happened.

During the Corona quarantine, I decided to give my data career a new start.

The whole world situation got me reflecting on the things that matter to me, and I realized that doing frontend development felt quite superficial. I felt the hunger to do something more meaningful, and that is why I gave data science and AI another chance.

Don't get me wrong: I still love frontend development. It is a strong passion of mine and I will never give up on it. I just want to do more. I want to help. I want to give more value. I want to help make the world a better place.

What Have I Been Doing?

I finished Andrew Ng's Deep Learning specialization in June (much faster than I had planned). It was a game-changer and I would recommend it to everybody who wants to learn deep learning.

I would also recommend taking fast.ai's Practical Deep Learning for Coders at the same time.

They both teach the same(ish) subjects, but Andrew Ng uses a bottom-up approach and fast.ai uses a top-down approach. Learning the same subjects from opposite directions made a huge difference for me.

I am more of a top-down learner myself, and sometimes it is really hard to keep up the motivation with bottom-up style learning. That was one of the reasons why I failed the last time I tried to learn data science: I thought you could only learn it bottom-up. This time I am learning by doing, I am excited all the time, and I am learning much faster.

Diving into NLP

I started searching for an employer for my thesis in April. I had planned to do a journal-style thesis about the process of learning machine learning. But in June, after dozens of applications and a lot of stress, I got lucky: one of my preferred companies contacted me with a proposal for a thesis project related to NLP.

I had been interested in both computer vision and natural language processing, but could not decide which one to focus on. Most machine learning courses focus on computer vision. It seems to be seen as the "sexier" subject.

For me, though, NLP seems to be the subject that has more to give. Getting computers to understand human languages and to communicate with us will open so many doors. In my opinion, NLP has many more unexplored areas than computer vision.

The more I learn about NLP, the more I get sucked into it. There are so many opportunities in NLP that we have still not conquered. We don't even have many models that work with languages other than English. One of my personal interests is Finnish NLP. The Finnish language is very different from English, and there are only a handful of models that work with it.

Next Steps

I started working on my thesis in June. I plan to work on the project until September, and the thesis should be published in November at the latest.

I will share more information about the project as it proceeds. It focuses on the semantic similarity between two or more documents.

I created a repository to document my learning process. There is not much content yet, but I will put everything I do into that repository, along with all the links and sources I use.
