Big Data and Data Analytical Tools Learning Experience

In this blog post, I will share the basics of big data and data analytics, compare public data sources and explain which one I will likely use in the future, and compare no-code data analytics tools. I also share my experience in a video recorded with Loom.

Big data refers to very large sets of data that are difficult to process using traditional methods. This data can be structured, like spreadsheets, or unstructured, like social media posts, and it comes in many different formats. The main characteristics of big data are that there is a lot of it, it is generated quickly, and it takes many different forms (often called volume, velocity, and variety). To handle big data, special tools and technologies have been created to store, process, and understand it. Using big data can help businesses and organizations make better decisions by analyzing patterns and trends.

Data analytics, in turn, means examining sets of data to find useful information and make smart decisions. This involves different ways of organizing and understanding the data, like identifying patterns or trends. Data analytics relies on tools such as software, algorithms, and charts to help people make sense of the data. People use data analytics in many areas, like business and healthcare, to make better decisions, improve processes, and find new opportunities.

Comparing public data sources

The World Bank Open Data is a free platform that provides access to a wide range of development data, including over 10,000 indicators and datasets across different topics such as poverty, education, health, and the environment. The platform promotes transparency and accountability in development efforts by providing free and open access to the World Bank's vast collection of development data. The platform is widely used by researchers, policymakers, journalists, and development practitioners to inform decision-making, monitor progress on development goals, and advance research on global issues.

The US Government's open data platform (Data.gov) provides public access to government data from federal, state, and local agencies. The platform promotes transparency, accountability, and innovation by releasing data in machine-readable formats and through APIs, making it easier for citizens, businesses, and researchers to analyze and reuse the data. The platform is widely used for research, policymaking, and civic engagement.

Google Dataset Search is a search engine that helps users find datasets on various topics from different sources. It searches for datasets across different domains and indexes datasets from academic publishers, data repositories, and government websites. The search results include information on the dataset, citation information, and related datasets. It simplifies the process of finding and accessing datasets by providing a single platform where users can easily search for relevant datasets from different sources.

Kaggle is an online platform for data science competitions, machine learning projects, and data exploration. It provides a community of data scientists and machine learning enthusiasts with access to datasets, tools, and resources to solve real-world problems using data-driven approaches. Kaggle hosts a wide range of data science competitions, ranging from predicting house prices to developing algorithms that can diagnose medical conditions. These competitions are sponsored by companies, organizations, and government agencies that are interested in finding solutions to data-related problems. Kaggle is a great resource for people interested in data science, machine learning, and artificial intelligence, and it's a platform where they can practice their skills, learn new techniques, and collaborate with others in the field.

No-Code Data Analytics Tools

Google Data Studio (renamed Looker Studio in October 2022) is a tool that helps people create easy-to-understand reports and charts using data from different sources. It has many options to make the reports look good, and it lets people work together to create them. The reports can be shared with others and updated in real time. It works with other Google tools, like Google Sheets and Google Drive, which makes it easier to use.

Power BI is a tool that helps businesses understand their data by creating visualizations like charts and graphs. It connects to different types of data sources, like Excel and databases, and can find hidden patterns and trends. Power BI can be used on a computer, in the cloud, or on a mobile device. It helps people make better decisions by understanding their data.

Tableau Public is a tool that lets you create cool and interactive charts and graphs for free. You can connect it to different types of data, like spreadsheets or databases, and make the data come alive by creating pictures and maps. You can also put these pictures and maps together to make a story, called a dashboard, and share it with other people. Tableau Public has a big community of people who share their work, so you can find new ideas and learn from others. 

Power BI - 24 February 2023 - Watch Video

Data Set Example

For my example, I used the World Bank data set under Health: Life expectancy at birth, male (years). https://data.worldbank.org/indicator/SP.DYN.LE00.MA.IN?view=chart

I then narrowed the data to Colombia, where my parents were born and raised and where I spent half of my childhood. https://data.worldbank.org/indicator/SP.DYN.LE00.MA.IN?view=chart&locations=CO

Viewing and analyzing this data is giving me some valuable information. Not that it is going to predict my future, since many variables contribute to living a longer life.

I will be using Power BI for this project because I found it as user-friendly as Google Data Studio and still wanted to challenge myself to learn a new Microsoft application.

ChatGPT Analysis

This is a table of data from a source called "World Development Indicators". It was last updated on December 22, 2022. The data is about the country of Colombia and shows the life expectancy of males from 1960 to 2021. Each year has a number that represents the average number of years a male in Colombia is expected to live. For example, in 1960 the average male life expectancy was 55.099 years, while in 2021 it was 71.537 years. This table can be used to see how male life expectancy has changed over time in Colombia.

Analyzing further in ChatGPT

Looking at the data for male life expectancy in Colombia from 1960 to 2021, we can see that there has been a general trend of improvement over time. In 1960, the average male life expectancy was 55.099 years, while in 2021 it had increased to 71.537 years, which represents an increase of over 16 years.
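The size of that increase is easy to check by hand. Here is a minimal Python sketch of my own, assuming only the two endpoint values quoted above (the variable names are just for illustration):

```python
# First and last values quoted in the paragraph above (years of life expectancy).
first_value = 55.099
last_value = 71.537

# Absolute gain in years, and the gain relative to the starting value.
gain_years = last_value - first_value
gain_percent = gain_years / first_value * 100

print(f"Gain: {gain_years:.3f} years ({gain_percent:.1f}% above the first value)")
```

The gain works out to 16.438 years, which matches the "over 16 years" in the text.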

However, there are also some fluctuations in the data over time. For example, in the mid-1990s, there was a dip in male life expectancy that lasted for a few years before recovering in the late 1990s. It's worth noting that the year 2020 has a missing value, which could be due to the COVID-19 pandemic and the impact it had on data collection and reporting.
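One way to check where the dips fall is to compute the change from each year to the next. The sketch below is my own illustration, not something ChatGPT produced. It uses the 61 values from the Colombia row pasted in the prompts section of this post and assumes the first 60 line up with 1960 to 2019. The row has one value fewer than the 62 year columns, so the final value is not given a year rather than guessed at. Run this way, the declining years come out as 1985, 1987 to 1990, and 2019, so the timing of the dip described above is worth checking against the World Bank chart.

```python
# Values copied from the Colombia row pasted in the prompts section of this post.
values = [
    55.099, 55.714, 56.306, 56.919, 57.385, 57.826, 58.214, 58.529, 58.84, 59.147,
    59.459, 59.913, 60.383, 60.806, 61.354, 61.913, 62.415, 62.921, 63.373, 63.749,
    64.174, 64.451, 64.75, 64.901, 65.024, 63.477, 64.975, 64.756, 64.605, 64.576,
    64.539, 64.652, 64.86, 65.221, 65.533, 65.885, 66.13, 66.431, 66.662, 66.884,
    67.312, 67.545, 68.076, 68.622, 69.032, 69.513, 69.961, 70.374, 70.933, 71.487,
    71.817, 72.141, 72.455, 72.725, 72.985, 73.252, 73.519, 73.759, 73.842, 73.8,
    71.537,
]

# Assumption: the first 60 values correspond to 1960..2019. The row has 61
# values for 62 year columns, so the last value is not assigned a year here.
years = list(range(1960, 2020))
series = dict(zip(years, values))  # zip stops after the 60 labelled years

# Year-over-year change for every year that has a predecessor.
changes = {year: series[year] - series[year - 1] for year in years[1:]}

# Years in which the value fell compared with the previous year.
declining_years = [year for year, change in changes.items() if change < 0]
largest_drop_year = min(changes, key=changes.get)

print("Declining years:", declining_years)
print("Largest single-year drop:", largest_drop_year,
      round(changes[largest_drop_year], 3))
```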

Overall, this data suggests that male life expectancy in Colombia has generally been improving over time, but there have also been some ups and downs along the way. It would be interesting to compare this data to similar data for other countries in the region or around the world to get a better sense of how Colombia's performance compares to that of other countries.

Predicting in ChatGPT

While we can use historical data to make some general predictions about future trends, it's important to keep in mind that there are many factors that can influence life expectancy and make it difficult to predict future outcomes with certainty. Some of these factors include changes in healthcare access and quality, improvements or declines in economic conditions, changes in social norms and lifestyle choices, and unexpected events like pandemics or natural disasters.

That being said, based on the historical data presented, we could make a general prediction that male life expectancy in Colombia will likely continue to increase over time, assuming that current trends and conditions persist. However, it's important to note that this prediction is subject to a lot of uncertainty and there could be many different scenarios that could impact the trajectory of male life expectancy in Colombia.
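To make "assuming that current trends and conditions persist" concrete, here is a small sketch of my own that fits a straight line to the 2000 to 2019 values from the row pasted in the prompts section of this post and extends it forward. It rests on several assumptions: the values are assumed to line up with those years, the choice of 2000 to 2019 and of a straight line is arbitrary, and the target year 2030 is only an example. The result is an extrapolation, not a forecast.

```python
# Values for 2000..2019, copied from the Colombia row pasted in this post
# (assumed to line up with these years).
years = list(range(2000, 2020))
values = [
    67.312, 67.545, 68.076, 68.622, 69.032, 69.513, 69.961, 70.374, 70.933, 71.487,
    71.817, 72.141, 72.455, 72.725, 72.985, 73.252, 73.519, 73.759, 73.842, 73.8,
]

# Ordinary least-squares fit of a straight line: value = intercept + slope * year.
n = len(years)
mean_year = sum(years) / n
mean_value = sum(values) / n
slope = (
    sum((y - mean_year) * (v - mean_value) for y, v in zip(years, values))
    / sum((y - mean_year) ** 2 for y in years)
)
intercept = mean_value - slope * mean_year


def trend_value(year):
    """Value on the fitted line for a given year (a straight-line extrapolation)."""
    return intercept + slope * year


target_year = 2030  # hypothetical example year, not taken from the data
print(f"Fitted slope: {slope:.2f} years of life expectancy per calendar year")
print(f"Trend line at {target_year}: {trend_value(target_year):.1f} years")
```

A straight line ignores shocks such as the drop at the very end of the pasted row, which is exactly the kind of uncertainty described above.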

My Emotions

Before embarking on this project, I was filled with a sense of anticipation and excitement, eager to delve into uncharted territory and acquire fresh knowledge. This class has been an amazing experience. It has given me a great sense of pride and joy, sparked a lot of thinking, and shown that an old dog can learn new tricks. I've had the privilege of immersing myself in cutting-edge technology that I'm looking forward to employing in future courses. As I worked on the assignment, I was thoroughly engaged and thrilled to be discovering novel tools. I found Power BI to be remarkably user-friendly, and witnessing my data come to life in the dashboard was a truly gratifying experience. Upon completing the task, a weight was lifted from my shoulders, and I felt a sense of satisfaction and elation at having wrapped up this week's coursework. Reflecting on my use of ChatGPT, I must confess that I thoroughly enjoyed employing this technology. It's truly awe-inspiring how artificial intelligence can transform a few words into a well-crafted paragraph brimming with information. ChatGPT proved to be exceedingly helpful, and I'm grateful for the assistance it offered.

Prompts Used with ChatGPT

Please explain the basics of big data?

    -Re-write using simpler terms

explain the basics of data analytics?

    -rewrite using simpler terms

what is the World Bank Open Data?

What is the US Government Open Data?

what is Google Dataset Search?

What is Kaggle?

What is Google Data Studio?

What is Power BI?

What is Tableau Public?

    -For all of these, I asked ChatGPT to rewrite the answer in a shorter version or in simpler terms

Data Source World Development Indicators

Last Updated Date 12/22/22

Country Name Country Code Indicator Name Indicator Code 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021

Colombia COL Life expectancy at birth, male (years) SP.DYN.LE00.MA.IN 55.099 55.714 56.306 56.919 57.385 57.826 58.214 58.529 58.84 59.147 59.459 59.913 60.383 60.806 61.354 61.913 62.415 62.921 63.373 63.749 64.174 64.451 64.75 64.901 65.024 63.477 64.975 64.756 64.605 64.576 64.539 64.652 64.86 65.221 65.533 65.885 66.13 66.431 66.662 66.884 67.312 67.545 68.076 68.622 69.032 69.513 69.961 70.374 70.933 71.487 71.817 72.141 72.455 72.725 72.985 73.252 73.519 73.759 73.842 73.8 71.537

Can you analyze this data further?

can you predict further based on this data?

In Conclusion

This week, we had the opportunity to explore various data sources and data analysis tools. Personally, I relished the chance to dig deeper into World Bank Open Data and research male life expectancy. My understanding of Power BI improved, and I'm eager to explore the application further in upcoming projects. I found this new knowledge highly enriching, and I'm excited to keep expanding my understanding of technological advancements. In the upcoming week, I'll be using Power BI to gain deeper insight into how I can apply it to my professional pursuits.

Comments

  1. Hi Ryan, great job on your blog. You did a phenomenal job creating a thorough blog explaining each concept, and it was very informative. Your blog is organized very nicely, with the bolded topics titles that stand out from the rest of the text. I look forward to reading your blog next week!
