Exploring the Evolution of Technology through Wikipedia

by Salar Ather

 

Introduction

I decided to use Wikipedia pages to analyze how technology has evolved over the past 17 years (since Wikipedia came into existence). I wanted to see how technology that has been around for a long time, such as the radio, has changed over that period, and how more recent and emerging technologies have changed. I am interested in this topic because I use a lot of emerging technologies that are constantly changing. I wanted to compare these technologies with older ones and to figure out at what point in time changes were most frequent, which would help me identify the turning points in technological advancement.

Page Selection

I grabbed the pages from Wikipedia and tried to create a diverse set. By a diverse set I mean that I chose technologies from different fields, with a particular focus on telecommunication, because I feel the way we connect has changed drastically over the past couple of decades. The other pages cover technologies in medicine, personal computers, memory, entertainment, gaming, artificial intelligence and more.

I tried to order the pages by how recent the technologies are, but the ordering is not perfect. At the top you will find more recent and emerging technologies, and as you go down the list you will find technologies that have existed for some time now. I did that because I expected to see a pattern in emerging technologies versus old/existing technologies, and this seemed like the natural order to go through.

Key Observations

There were some interesting finds in the collection of pages. I’ll point out some that really intrigued me:

    For the Speech Recognition Wikipedia page, it was very interesting to note that the word ‘neural’ did not start appearing as a top word until 2012, which is when artificial neural networks started getting popular because of higher computing power. The word ‘deep’ became a top word around 2015, when the term deep learning was becoming popular. The term ‘lstm’ appeared as a top word in 2016. This might be because by 2016 major technology companies including Google, Apple, Amazon and Microsoft were using LSTMs as fundamental components in products such as Siri and Alexa, and because multiple published papers used LSTMs in their neural network architectures.
    For the Wikipedia page on the Internet, one very interesting thing was that the word ‘cafes’ became very popular in 2003 and 2004 under the section ‘Access’ or ‘Internet Access’. The section on Internet Access was removed as a separate section and placed under Infrastructure in 2014 and has stayed that way since, symbolizing the increased ease of access to the internet in the modern world. Social Impact and Security appeared as separate sections in 2010 and 2014 respectively, showcasing the increased sensitivity to the impact of the internet. The word ‘chat’ was pretty popular between 2006 and 2008 and even appeared in the introductory paragraph in those years.
    The page on video game consoles was one I was looking forward to analyzing. It’s interesting to note that for the first couple of years, the first paragraph described video game consoles as devices that are separate televisions. Since 2006, ‘Nintendo’ has been a top word because of the company’s dominance in video game consoles, specifically its handheld versions such as the Game Boy (‘handheld’ is itself a word that became popular in recent years).

Game Boy Advance: One of the most popular handheld devices in history
    Personal Computer was the big one I wanted to look at. One notable thing was how the section headers changed over time. ‘Non IBM Compatible Computers’ was a section heading for several years, showcasing IBM’s dominance in the market, but the header was removed in 2007. The word ‘laptop’ became frequent after 2008, which is also when more focus was put on Hardware and Software as they were added as separate sections to the page. ‘Windows’ has been a popular word since 2008 and still appears as one of the top words. The number of characters on the page increased drastically between 2004 and 2010, but since then the growth has slowed down and in some consecutive years even halted.
    The Self Driving car was referred to as ‘science fiction’ in 2003 and 2004. In 2005, it was referred to as a theoretical technology, with the page having only 6405 characters. In 2006, though, it started being referred to as ‘an emerging family of technologies’, with the number of characters on the page as well as the number of sections increasing more than tenfold compared to the previous year. After a little research, I found that this was likely because in 2005 the Defense Advanced Research Projects Agency held its second DARPA Grand Challenge, offering a $2 million prize to the first driverless car to complete the course, and for the first time vehicles were able to finish it (no vehicle finished the course in the first challenge, held in 2004).

Stanley, the winner of the second DARPA Grand Challenge
    ‘Broadband’ has had several different definitions on Wikipedia over the years, and in 2008 and 2009 the introductory paragraph even read, “The term broadband can have different meanings in different contexts. The term's meaning has undergone substantial shifts.” Other wireless technologies, like 4G, were referred to as theoretical until 2008. That changed in 2009 for 4G, as it was launched commercially for the first time.
    A dedicated page for cryptocurrency did not exist until 2013. The ‘Economics’ section header was added that year and has become immensely more popular in the last couple of years.
    For the page on mobile phones, we see that information keeps getting appended to the definition of a mobile phone and little is removed. The word ‘smartphone’ first appears in the definition in 2015, a time when more than 70% of Americans owned smartphones. The word ‘nokia’ was very popular between 2010 and 2016, as Nokia was for many years the largest producer of mobile phones, including an early ‘smart’ phone, the Nokia 9000 Communicator.

Early evolution of the cellphone
    Cluster 8 is interesting as it groups all artificial intelligence based speech recognition pages together. Cluster 0 seems to group together technologies related to communication.
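
The ‘top words’ mentioned throughout the observations above could be computed along the following lines. This is only a minimal sketch, not necessarily how the lists in this post were produced: it assumes one plain-text snapshot of a page per year, and the stop-word list and the function names (top_words, top_words_by_year, first_year_in_top) are hypothetical names chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "and", "for", "that", "with", "are", "was", "this", "from"}


def top_words(text, n=5):
    """Return the n most frequent words in text, lowercased, skipping stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    # Words of one or two letters are dropped as noise (an assumption of this sketch).
    counts = Counter(w for w in words if len(w) > 2 and w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(n)]


def top_words_by_year(snapshots, n=5):
    """Map each year to the top words of that year's page text, in year order."""
    return {year: top_words(text, n) for year, text in sorted(snapshots.items())}


def first_year_in_top(snapshots, word, n=5):
    """Return the earliest year in which word is among the top n words, or None."""
    for year, words in top_words_by_year(snapshots, n).items():
        if word in words:
            return year
    return None
```

Given a dictionary mapping years to page text, first_year_in_top(snapshots, "neural") would report the first year in which ‘neural’ enters the top-word list, which is the kind of observation made for the Speech Recognition page.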

Summary

It was interesting to note that certain words started appearing more often as Wikipedia pages got revised while other words became less popular, and this mostly had to do with how the technologies were advancing or changing with time. There did not seem to be much of a time lag either. For example, the word ‘neural’ became a lot more frequent on pages related to artificial intelligence models as neural networks became increasingly popular, and the word ‘nokia’ was very popular during the time when the company was dominating the mobile phone market.

The number of characters in a Wikipedia page also had a positive correlation with more widespread real-life applications of the technology, as with self-driving cars, but that is a conclusion I would have expected anyway. What’s more interesting is how the introductory paragraphs changed over time for certain technologies. For devices such as the mobile phone, the introductory paragraph simply gained more information as functionality increased, but for technologies like broadband the introductory paragraph gave different definitions of the technology at different times. It’s also interesting how definitions changed as technologies became less theoretical and started having real-life applications.
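
The jumps in page length discussed above (for example, a page growing more than tenfold in a year) can be picked out with a simple year-over-year ratio. The sketch below is an illustration under assumptions: it takes a dictionary of character counts per year, and the function names growth_factors and jump_years as well as the default threshold are hypothetical.

```python
def growth_factors(char_counts):
    """Return {year: count / previous snapshot's count} for consecutive snapshots."""
    years = sorted(char_counts)
    factors = {}
    for prev, curr in zip(years, years[1:]):
        # Skip pairs where the earlier snapshot is empty to avoid dividing by zero.
        if char_counts[prev] > 0:
            factors[curr] = char_counts[curr] / char_counts[prev]
    return factors


def jump_years(char_counts, threshold=10.0):
    """Return the years whose page is at least threshold times longer than the previous snapshot."""
    return [year for year, factor in growth_factors(char_counts).items() if factor >= threshold]
```

With made-up counts such as {2004: 1000, 2005: 2000, 2006: 25000} (illustrative numbers, not measurements from any page), jump_years would flag only 2006.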

Section headers seem to indicate the popularity of, and the general population’s interest in, particular aspects of a technology. For example, non IBM compatible computers seemed to be something the general population wanted to know about at a time when IBM was dominating the market. Similarly, Internet Access was a separate section in the early 2000s, when people actually had less access to the internet, but it was later removed and placed under a different section header as getting access to the internet became easier and the demand for information on internet access decreased. The addition of the Economics section header for cryptocurrencies in 2013 also supports this hypothesis, as people became more interested in cryptocurrencies as forms of investment.
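
Additions and removals of section headers like the ones described above can be found by comparing the header lists of consecutive yearly snapshots as sets. This is a minimal sketch that assumes the headers have already been extracted into one list per year; the function name header_changes is a hypothetical one used for illustration.

```python
def header_changes(headers_by_year):
    """For each year after the first, list headers added and removed since the previous snapshot."""
    years = sorted(headers_by_year)
    changes = {}
    for prev, curr in zip(years, years[1:]):
        before = set(headers_by_year[prev])
        after = set(headers_by_year[curr])
        changes[curr] = {
            "added": sorted(after - before),
            "removed": sorted(before - after),
        }
    return changes
```

For a simplified, illustrative input such as {2013: ["Access", "Infrastructure"], 2014: ["Infrastructure", "Security"]}, the result for 2014 would list ‘Security’ as added and ‘Access’ as removed. A renamed header shows up as one removal plus one addition, so renames would still need to be checked by hand.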

The difference between emerging/new and existing/old technologies is most evident in the introductory paragraphs. For existing technologies these stayed fairly constant or were only appended to, while for emerging technologies they changed drastically, especially for technologies that were theoretical at some point before real-life applications came out. The top words lists for existing technologies show how different aspects of those technologies were popular at different times, while buzzwords were more prominent for emerging technologies such as speech recognition and self-driving cars.