Hello, this is 늑대양 :)
I was selected as a 5th-cohort runner for the academy program run by 가짜연구소 (Pseudo Lab), so I will be blogging about what I learn there.
Pseudo Lab main URL: https://pseudo-lab.com/
Program: Data Scientist - Python - 황지* / 이다*
Main learning material: DataCamp
- DataCamp main URL: https://www.datacamp.com/
Week 1 scope: Understanding Data Science
Understanding data science:
What is data science?
Making data work for you!
What can data do?
- Describe the current state of an organization or process.
- Detect anomalous events.
- Diagnose the causes of events and behaviors.
- Predict future events.
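To make "describe" and "detect" a bit more concrete, here is a minimal sketch using only the Python standard library. The daily order counts and the "more than 2 standard deviations" rule are my own assumptions for illustration, not something from the course.

```python
import statistics

# Hypothetical daily order counts; the last value is an obvious spike.
daily_orders = [102, 98, 101, 97, 103, 99, 100, 180]

# Describe: summarize the current state of the process.
mean_orders = statistics.mean(daily_orders)
stdev_orders = statistics.stdev(daily_orders)  # sample standard deviation

# Detect: flag values more than 2 standard deviations away from the mean.
# The threshold of 2 is an assumed rule of thumb, not a universal setting.
anomalies = [x for x in daily_orders if abs(x - mean_orders) > 2 * stdev_orders]

print(f"mean={mean_orders}, stdev={stdev_orders:.1f}, anomalies={anomalies}")
```

Diagnosing causes and predicting future events usually need more context and more data than a single list of numbers can give.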
Why now?
Data is everywhere!
The data science workflow:
- Data collection & storage
- Data preparation
- Exploration & visualization
- Experimentation & prediction
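The four workflow stages above can be sketched as four small functions. Everything here is hypothetical: the sales records, the function names, and the naive "extend the average daily change" forecast are assumptions I made to show the flow, not a real pipeline.

```python
import statistics

def collect():
    # Data collection & storage: stand-in for reading from a file or database.
    return [
        {"day": 1, "sales": 10.0},
        {"day": 2, "sales": None},  # missing value
        {"day": 3, "sales": 14.0},
        {"day": 4, "sales": 16.0},
    ]

def prepare(rows):
    # Data preparation: drop rows whose sales value is missing.
    return [r for r in rows if r["sales"] is not None]

def explore(rows):
    # Exploration & visualization: here only summary numbers, no plot.
    sales = [r["sales"] for r in rows]
    return {"count": len(sales), "mean": statistics.mean(sales), "max": max(sales)}

def predict_next(rows):
    # Experimentation & prediction: naive forecast that adds the average
    # change per day (first row to last row) to the last observed value.
    first, last = rows[0], rows[-1]
    change_per_day = (last["sales"] - first["sales"]) / (last["day"] - first["day"])
    return last["sales"] + change_per_day

clean_rows = prepare(collect())
summary = explore(clean_rows)
forecast = predict_next(clean_rows)
print(summary, forecast)
```

In a real project each stage would be much larger, but the order of the steps stays the same.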
Let's practice!
Applications of data science:
More case studies
- Traditional machine learning
- Internet of Things (IoT)
- Deep learning
Case study: fraud detection
What do we need for machine learning?
- A well-defined question
  - "What is the probability that this transaction is fraudulent?"
- A set of example data
  - Old transactions labeled as "fraudulent" or "valid"
- A new set of data to use our algorithm on
  - New credit card transactions
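The three ingredients above map onto code fairly directly. The sketch below is only a frequency estimate over made-up labeled transactions, not a real machine learning model; the `channel` feature, the data, and the function name `fraud_probability` are all hypothetical.

```python
from collections import defaultdict

# Example data: old transactions labeled "fraudulent" or "valid" (hypothetical).
history = [
    ("domestic", "valid"),
    ("domestic", "valid"),
    ("domestic", "valid"),
    ("domestic", "fraudulent"),
    ("foreign", "fraudulent"),
    ("foreign", "valid"),
]

# "Training": count totals and fraud cases per channel.
totals = defaultdict(int)
frauds = defaultdict(int)
for channel, label in history:
    totals[channel] += 1
    if label == "fraudulent":
        frauds[channel] += 1

def fraud_probability(channel):
    # The well-defined question: what is the probability that this
    # transaction is fraudulent? Estimated as the past fraud rate.
    if totals[channel] == 0:
        # Unseen channel: fall back to the overall fraud rate.
        return sum(frauds.values()) / sum(totals.values())
    return frauds[channel] / totals[channel]

# New data to use the "algorithm" on: new credit card transactions.
new_transactions = ["domestic", "foreign"]
scores = [fraud_probability(c) for c in new_transactions]
print(scores)
```

A real fraud model would use many more features and a proper learning algorithm, but the inputs and the output have the same shape as here.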
Case study: smart watch
Internet of Things (IoT)
Refers to gadgets that aren't standard computers
- Smart watches
- Internet-connected home security systems
- Electronic toll collection systems
- Building energy management systems
- Much, much more!
Case study: image recognition
Deep learning
- Many neurons work together
- Requires much more training data
- Used in complex problems
  - Image classification
  - Language learning/understanding
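As a rough picture of "many neurons work together", here is a tiny forward pass written in plain Python. The weights are hand-picked by me and nothing is trained, so this only shows how the outputs of two neurons feed a third one; real deep learning models have far more neurons and learn their weights from large amounts of data.

```python
import math

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    # One neuron: weighted sum of the inputs plus a bias, then a nonlinearity.
    return sigmoid(sum(x * w for x, w in zip(inputs, weights)) + bias)

def tiny_network(inputs):
    # Two hidden neurons feed one output neuron.
    # All weights and biases are arbitrary, hand-picked values (assumption).
    h1 = neuron(inputs, [0.5, -0.5], 0.0)
    h2 = neuron(inputs, [-0.5, 0.5], 0.0)
    return neuron([h1, h2], [1.0, 1.0], -1.0)

output = tiny_network([1.0, 1.0])
print(output)
```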
Let's practice!
Data science roles and tools:
- Data Engineer
- Data Analyst
- Data Scientist
- Machine Learning Scientist
Data engineer:
- Information architects
- Build data pipelines and storage solutions
- Maintain data access
- Main part: Data collection & storage
Data engineering tools:
- SQL: To store and organize data
- Java, Scala or Python: Programming languages to process data
- Shell: Command line to automate and run tasks
- Cloud computing: AWS, Azure, GCP
Data analyst:
- Perform simpler analyses that describe data
- Create reports and dashboards to summarize data
- Clean data for analysis
- Main part: Data preparation, Exploration & Visualization
Data analyst tools:
- SQL: Retrieve and aggregate data
- Spreadsheets (Excel or Google Sheets): Simple analysis
- BI tools (Tableau, Power BI, Looker): Dashboards and visualizations
- May have Python or R: Clean and analyze data
Data scientist:
- Versed in statistical methods
- Run experiments and analyses for insights
- Use traditional machine learning
- Main part: Data preparation, Exploration & Visualization, Experimentation & Prediction
Data scientist tools:
- SQL: Retrieve and aggregate data
- Python and/or R: Data science libraries, e.g., pandas (Python) and tidyverse (R)
Machine learning scientist:
- Predictions and extrapolations
- Classification
- Deep learning
  - Image processing
  - Natural language processing (NLP)
- Main part: Data preparation, Exploration & Visualization, Experimentation & Prediction (main focus)
Machine learning tools:
- Python and/or R: Machine learning libraries, e.g., TensorFlow or Spark
Let's practice!
Week 1 topic: Introduction to Python
Week 1 learning material URL: https://app.datacamp.com/learn/courses/intro-to-python-for-data-science
Course Description:
Python is a general-purpose programming language that is becoming ever more popular for data science. Companies worldwide are using Python to harvest insights from their data and gain a competitive edge. Unlike other Python tutorials, this course focuses on Python specifically for data science. In our Introduction to Python course, you’ll learn about powerful ways to store and manipulate data, and helpful data science tools to begin conducting your own analyses. Start DataCamp’s online Python curriculum now.
Index:
- Python Basics
- Python Lists
- Functions and Packages
- NumPy