For this activity, I learnt how the spaCy library works and what kind of functionality it brings to Natural Language Processing. Honestly, I was surprised by how much it abstracts away the processes we were taught to implement manually. Still, I'm glad we were taught HOW those processes work first, because it satisfies my curiosity. More than anything, it reinforced the idea to never "reinvent the wheel": while it's great to know how to implement a process yourself, there's probably already a library that implements it. More often than not, these libraries have been tested by many people and ironed out to be as stable and efficient as possible.
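To illustrate what I mean by abstraction, here's a minimal sketch of how much spaCy does in a single call. The sentence is my own example, and it assumes the small English model has already been downloaded with `python -m spacy download en_core_web_sm`:

```python
import spacy

# Load spaCy's small English pipeline (assumes en_core_web_sm is installed).
nlp = spacy.load("en_core_web_sm")

# One call runs the whole pipeline we learnt to build by hand:
# tokenization, tagging, lemmatization, parsing, and NER.
doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")

for token in doc:
    print(token.text, token.lemma_, token.pos_)
```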
My one pain point from this activity was having to comb through my text data manually to single out the named entities in it. It took me a good while just to scan through the news articles I picked, and even then I still managed to miss some entities on my second and third read-throughs. I'm just glad we only had to pick three articles and not ten or more.
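For contrast, here's roughly how spaCy surfaces those same entities automatically through `doc.ents`. The `article_text` string below is just a stand-in for one of the articles I picked:

```python
import spacy

nlp = spacy.load("en_core_web_sm")

# Placeholder text standing in for one of the news articles from the activity.
article_text = "NASA awarded SpaceX a $2.9 billion contract in April 2021."
doc = nlp(article_text)

# doc.ents holds every entity the statistical NER model found,
# with character offsets and a label such as ORG, MONEY, or DATE.
for ent in doc.ents:
    print(ent.text, ent.label_, ent.start_char, ent.end_char)
```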
Overall, the activity gave me a clearer picture of what NLP would look like once I start a project of my own. The one thing I'm excited to figure out and explore is customizing the process to a domain of my choosing. Until now, we've only used datasets that were "general" in a sense; I want to see how NLP can be custom-fitted to understand the niche information of my chosen domain.
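From what I've read so far, spaCy's EntityRuler seems like one way to start that kind of customization, by adding hand-written patterns for terms the general model doesn't know. The labels and terms below are hypothetical placeholders, not my actual domain:

```python
import spacy

nlp = spacy.load("en_core_web_sm")

# Insert an EntityRuler before the statistical "ner" component so the
# hand-written patterns take precedence for these niche terms.
ruler = nlp.add_pipe("entity_ruler", before="ner")
ruler.add_patterns([
    # Hypothetical domain-specific label and terms for illustration only.
    {"label": "GAME_ENGINE", "pattern": "Unreal Engine 5"},
    {"label": "GAME_ENGINE", "pattern": "Godot"},
])

doc = nlp("The studio ported its title from Godot to Unreal Engine 5.")
print([(ent.text, ent.label_) for ent in doc.ents])
```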