Finding The Most Popular Cartoon

Hi ML enthusiasts! Today, we will solve a case study on finding the most popular cartoon from a list of cartoons using Python.

We are given the data as a list of dictionaries, each holding the id and the name of a cartoon, like this:
cartoons = [
    { "id": 1, "name": "Oswald" },
    { "id": 2, "name": "Shin Chan" },
    { "id": 3, "name": "Powerpuff Girls" },
    { "id": 4, "name": "Pokemon" },
    { "id": 5, "name": "Bob The Builder" },
    { "id": 6, "name": "Noddy" },
    { "id": 7, "name": "Power Rangers" },
    { "id": 8, "name": "Transformers" }
]

Each dictionary has two keys, id and name. We are also given a connection chart that tells us which cartoons are connected to one another.


The chart shows how the cartoons are connected to each other (in some way or the other). The numbers in it are the ids of the cartoons. The same connections can be written as a list of tuples, as in the following code:

#List of tuples describing the connections between cartoons
Connections = [(1, 2), (2, 3), (3, 4), (3, 5), (3, 6), (3, 8),
               (4, 6), (5, 8), (6, 7), (6, 8)]

#Creating a new key "Connection" in every cartoon, starting as an empty list
for cartoon in cartoons:
    cartoon["Connection"] = []

#For every tuple in Connections, adding each cartoon's name to the
#"Connection" list of the other cartoon.
#We subtract 1 from ii and jj because the ids start from 1,
#while list indexing starts from 0 in Python.
for ii, jj in Connections:
    cartoons[ii - 1]["Connection"].append(cartoons[jj - 1]["name"])
    cartoons[jj - 1]["Connection"].append(cartoons[ii - 1]["name"])

#Returning the number of connections a cartoon has by means of a function
def num_of_connections(cartoon):
    return len(cartoon["Connection"])

#Calculating total number of connections in the list of dictionaries
total = sum(num_of_connections(cartoon) for cartoon in cartoons)

#Finding the length of cartoons list
length_of_cartoons = len(cartoons)

#Finding average_number_of_connections
average_number_of_connections = total / length_of_cartoons

#Pairing every cartoon id with its number of connections
connections_plus_id = [(cartoon["id"], num_of_connections(cartoon)) for cartoon in cartoons]

#Sorting the pairs by number of connections, largest first,
#so that the most popular cartoon comes first
sorted_connections_plus_id = sorted(connections_plus_id,
                                    key=lambda pair: pair[1],
                                    reverse=True)

We created an empty list under the “Connection” key of every dictionary in cartoons. Each list is then filled with the names of the connected cartoons, as per the logic above. Note that each pair appears only once in Connections, so every name is added only once to a given list. The outputs of the code are given below:

cartoons: [{'id': 1, 'name': 'Oswald', 'Connection': ['Shin Chan']},
{'id': 2, 'name': 'Shin Chan', 'Connection': ['Oswald', 'Powerpuff Girls']},
{'id': 3, 'name': 'Powerpuff Girls', 'Connection': ['Shin Chan', 'Pokemon', 'Bob The Builder', 'Noddy', 'Transformers']},
{'id': 4, 'name': 'Pokemon', 'Connection': ['Powerpuff Girls', 'Noddy']},
{'id': 5, 'name': 'Bob The Builder', 'Connection': ['Powerpuff Girls', 'Transformers']},
{'id': 6, 'name': 'Noddy', 'Connection': ['Powerpuff Girls', 'Pokemon', 'Power Rangers', 'Transformers']},
{'id': 7, 'name': 'Power Rangers', 'Connection': ['Noddy']},
{'id': 8, 'name': 'Transformers', 'Connection': ['Powerpuff Girls', 'Bob The Builder', 'Noddy']}]
total: 20
length_of_cartoons: 8
average_number_of_connections: 2.5
connections_plus_id: [(1, 1), (2, 2), (3, 5), (4, 2), (5, 2), (6, 4), (7, 1), (8, 3)]
sorted_connections_plus_id: [(3, 5), (6, 4), (8, 3), (2, 2), (4, 2), (5, 2), (1, 1), (7, 1)]
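
As a cross-check on these numbers, the same counts can be derived straight from the tuples, without building the name lists. The sketch below is not part of the solution above; it assumes the same Connections list and uses collections.Counter from the standard library, and the name degree is introduced here only for illustration. Since every tuple adds one connection to each of its two cartoons, the total should be twice the number of tuples.

```python
from collections import Counter

# Same pairs as in the code above
Connections = [(1, 2), (2, 3), (3, 4), (3, 5), (3, 6), (3, 8),
               (4, 6), (5, 8), (6, 7), (6, 8)]

# degree (hypothetical name): cartoon id -> number of connections
degree = Counter()
for ii, jj in Connections:
    degree[ii] += 1
    degree[jj] += 1

# Every tuple is counted twice, once for each cartoon in it
total = sum(degree.values())

print(dict(sorted(degree.items())))
# {1: 1, 2: 2, 3: 5, 4: 2, 5: 2, 6: 4, 7: 1, 8: 3}
print(total, total == 2 * len(Connections))
# 20 True
```

The counts agree with connections_plus_id, and the total of 20 is exactly twice the 10 tuples.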

From sorted_connections_plus_id, we can see that the cartoon with id = 3, Powerpuff Girls, has the maximum number of connections (5). Thus, it is the most popular!
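
If only the winner is needed, sorting the whole list is optional. Here is a minimal sketch, assuming the connections_plus_id list printed earlier; the names most_popular, average and above_average are introduced here for illustration. It uses the built-in max() with a key, and also puts the average to use by listing the cartoons that have more connections than average.

```python
# (id, number of connections) pairs, copied from the output above
connections_plus_id = [(1, 1), (2, 2), (3, 5), (4, 2),
                       (5, 2), (6, 4), (7, 1), (8, 3)]

# Pick the pair with the largest connection count
# (on a tie, max() returns the first such pair)
most_popular = max(connections_plus_id, key=lambda pair: pair[1])

# Average number of connections per cartoon
average = sum(count for _, count in connections_plus_id) / len(connections_plus_id)

# Ids of the cartoons with more connections than the average
above_average = [cid for cid, count in connections_plus_id if count > average]

print(most_popular)   # (3, 5)
print(average)        # 2.5
print(above_average)  # [3, 6, 8]
```

This gives the same answer as the sorted list: id 3 comes out on top, with ids 6 and 8 also above the average.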

So, this was our first data science case study using Python. To solve more case studies related to data science, machine learning and neural networks, stay tuned!


