Ch3: Polls and Predictions (Classification or Regression?)
On the edge of her seat, Amma watches the election results unfold on TV, her eyes flickering with anticipation. In their cozy living room, the tension is palpable. Next to her, Meenu tries to catch up on some work, but her mother's excitement is contagious.
"Meenu, are you following the AP elections? I'm very nervous and excited to see who will become the chief minister this term," Amma exclaims, her voice a mix of anxiety and thrill.
"Yes, Amma, I've been following. It seems to be a tight race between the two parties," replies Meenu, glancing at the television screen that flashes vibrant graphs and numbers.
"If only we had a machine learning model that could predict which party will win and how many seats each will secure based on previous data. It would be supervised learning, right?" Amma muses, her newly acquired tech jargon making Meenu smile.
"Wow, Mom, you've picked up on that pretty quickly! Yes, you're right, it is supervised learning. Predicting whether a party wins or not is a classification problem, and predicting the number of ministers from each party is a regression problem," Meenu explains, impressed by her mother’s grasp of the concepts.
Amma’s curiosity peaks. "What are classification and regression? Is there more to supervised learning?" she asks, eager to understand more.
"Definitely, Amma. Let's start with regression. This type involves predicting a continuous value. For example, consider you have data on various houses, including their size, number of bedrooms, location, and price. Here, the goal is to predict the price of a house based on its features. Price is a continuous value. Connecting this to our election results, the number of candidates who will win from each party is a regression problem because the number is continuous."
"And classification?" Amma interjects, her interest peaking.
"Classification involves predicting a label into discrete classes. It can be two-class or multi-class. An example of two-class classification is if you have data on students and their marks and whether they passed or failed a previous test. Based on this, you can predict whether a student will pass the upcoming test or not. A multi-class example is when you have images of handwritten digits from 0 to 9 and their corresponding labels. The goal here is to identify which digit a new handwritten image represents, and here, there are 10 classes."
"Now, in the context of our election, predicting which party wins is a two-class classification problem, as the outcome is either win or lose."
Intrigued, Amma ponders, "And how does a model know if it's a regression or classification problem?"
"That depends on the type of data and the function of the model. There are specific types of modeling and activation functions tailored for each," Meenu says, noting her mother's overwhelmed expression.
"Let’s leave the deeper terminology for another day. I think I need to sit down with a book to note all this down. For now, I'll just see if the party we voted for is classified to win or not," Amma chuckles, her eyes returning to the screen, now armed with a little more understanding and a lot more curiosity about the intersection of technology and her world of politics.