Tableau
Intro to Data Visualization & Tableau
- Visualization
- Way to represent data that is visually appealing and interactive
- Way to graphically represent data in the form of charts
- BI tool / Data Visualization tool
- Representing data to gain or generate more compelling insights
- Easy to understand
- Using Data intelligence to drive your business
- Why is Tableau called a BI tool
- Tableau turns data into visual representations that surface business insights
- Data Visualization tool
- Using data to drive decisions
- Other visualization tools
- Power BI
- Python
- Excel
- Dashboard
- Collection of different data summaries that together tell a complete story
- Why Tableau
- No coding knowledge
- Connect to multiple data sources with ease
- Excel
- MySQL
- AWS
- Google Sheets
- APIs
- Interactivity
- Data refresh
- Deployment to Tableau server
Visualization requires
- Import file
- preprocessing to clean data
- manipulate the data
- summarize the data
- come up with some insights from data
- Create visualizations
What is required to do Analysis
- What is data about
- Shape of data
- #rows & #cols
- Attributes/characteristics of Data
- Level of data -> Granularity(Slicing to smallest unit)
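The checks above can also be run on a file before it is loaded into Tableau. A minimal sketch in plain Python, assuming a small hypothetical list of order rows (the column names `order_id`, `product` and `quantity` are illustrative and not taken from the linked sheet):

```python
# Hypothetical sample rows; column names are assumptions for illustration.
rows = [
    {"order_id": "O1", "product": "Chair", "quantity": 2},
    {"order_id": "O1", "product": "Table", "quantity": 1},
    {"order_id": "O2", "product": "Chair", "quantity": 5},
]

# Shape of data: number of rows and number of columns.
n_rows = len(rows)
n_cols = len(rows[0])

# Attributes: column names with the Python type of their values.
attributes = {name: type(value).__name__ for name, value in rows[0].items()}


def is_unique_key(records, key_columns):
    """Return True if the given columns identify each row uniquely."""
    keys = [tuple(record[column] for column in key_columns) for record in records]
    return len(keys) == len(set(keys))


# Granularity: the smallest set of columns that uniquely identifies a row.
order_level = is_unique_key(rows, ["order_id"])
line_level = is_unique_key(rows, ["order_id", "product"])

print(n_rows, n_cols, attributes, order_level, line_level)
```

Here one row is an order line (order plus product), not a whole order, so counting rows would overstate the number of orders.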
https://docs.google.com/spreadsheets/d/1sM1Kjgy99k2tHBtKP5OXcpQLz7SC5yGr/edit#gid=432276125
Download file to Tableau
- File -> Download -> xlsx
- Data Visualization
- Visualizing, summarizing, aggregating to get insights from it
- Data scientist
- Understanding problem
- Framing a solution(solution approach)
- What is insight based on solution approach
- What is the recommendation based on insight
- Categorical Variables vs Numerical Variables
- Dimensions
- Qualitative & Categorical
- It is how we want to split the measure
- Measures
- Quantitative information to be measured
- usually numerical
- Discrete vs Continuous
- Discrete -> distinct, countable values (quantity, #students)
- Continuous -> any value within a range (temperature, rainfall)
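To make the dimension/measure split concrete, here is a minimal sketch in plain Python. The rows are hypothetical: `region` plays the role of a dimension and `sales` the role of a measure, aggregated with SUM.

```python
from collections import defaultdict

# Hypothetical rows: "region" is a dimension, "sales" is a measure.
rows = [
    {"region": "East", "sales": 100.0},
    {"region": "West", "sales": 250.0},
    {"region": "East", "sales": 50.0},
]

# The dimension decides how the measure is split;
# the measure itself is aggregated (SUM in this sketch).
sales_by_region = defaultdict(float)
for row in rows:
    sales_by_region[row["region"]] += row["sales"]

print(dict(sales_by_region))
```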
- Data Types
- Tableau Public vs Tableau Desktop
- Higher security; row-level security is available in Desktop
- Only 1M rows in Tableau Public, whereas Desktop has no row limit
- More connectivity options in Desktop
- Seasonality
- It is when your measure shows a recurring pattern at particular times of the year.
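One rough way to spot seasonality is to average the measure per calendar month across several years and see whether the same months stand out. A minimal sketch in plain Python with made-up observations:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (year, month, sales) observations.
observations = [
    (2021, 11, 80), (2021, 12, 200),
    (2022, 11, 90), (2022, 12, 220),
]

# Collect the sales of each calendar month across all years.
by_month = defaultdict(list)
for year, month, sales in observations:
    by_month[month].append(sales)

# A month that is consistently high (or low) every year hints at seasonality.
monthly_average = {month: mean(values) for month, values in by_month.items()}
peak_month = max(monthly_average, key=monthly_average.get)

print(monthly_average, peak_month)
```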
- GeoChart
- Heat map (geographical heat map)
- Bar Chart -> Categorical variable in X-Axis and Numerical variable in Y-axis
- Line Chart -> Date/time trend on X-axis and Numerical variable in Y-axis
- Pie Chart -> % of Total
- Label -> Quick Table Calculation -> % of Total
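The "% of Total" quick table calculation divides each slice by the sum of all slices. A minimal sketch of that arithmetic in plain Python, with hypothetical category totals:

```python
# Hypothetical sales per category.
sales = {"Furniture": 200.0, "Technology": 500.0, "Office Supplies": 300.0}

# Each slice of the pie is its value divided by the grand total.
total = sum(sales.values())
percent_of_total = {
    category: round(100 * value / total, 1) for category, value in sales.items()
}

print(percent_of_total)
```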
- Find Number of Orders based on Number of Quantities sold
- Histogram
- X-axis are binned values(eg 1-5, 6-10..)
- Y-axis has frequency
- Quantity -> right click -> create -> bins
- Quantity(Dim) -> X axis
- Order(count) -> Y axis OR Drag OrderId to rows & mark it as measure(count) instead of dimension
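The histogram steps above amount to assigning each quantity to a bin and counting orders per bin. A minimal sketch in plain Python, assuming a bin width of 5 and hypothetical orders. It follows the 1-5, 6-10 labelling used in these notes, which may differ from how Tableau labels its own bins.

```python
from collections import Counter

BIN_SIZE = 5  # assumed bin width, matching the 1-5, 6-10 example

# Hypothetical (order_id, quantity) rows.
orders = [("O1", 2), ("O2", 5), ("O3", 6), ("O4", 9), ("O5", 12)]


def bin_label(quantity, size=BIN_SIZE):
    """Return the label of the bin a quantity falls into, e.g. 7 -> '6-10'."""
    start = ((quantity - 1) // size) * size + 1
    return f"{start}-{start + size - 1}"


# X-axis: the bin labels; Y-axis: how many orders fall into each bin.
histogram = Counter(bin_label(quantity) for _, quantity in orders)

print(dict(histogram))
```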
- When a table is pivoted from wide to long format, its level/granularity changes
- Use Data Interpreter to remove junk rows
- To convert wide to long -> select columns -> Pivot -> the column names become row values alongside the original cell values -> rename the newly added pivot columns
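The pivot step can be pictured as turning each selected column into its own row. A minimal sketch in plain Python: the table, the helper `pivot_longer` and the new column names `year` and `units` are all hypothetical.

```python
# Hypothetical wide table: one row per product, one column per year.
wide = [
    {"product": "Chair", "2021": 10, "2022": 15},
    {"product": "Table", "2021": 7, "2022": 9},
]


def pivot_longer(records, id_column, value_columns, names_to, values_to):
    """Turn each listed column into its own row (wide -> long)."""
    long_rows = []
    for record in records:
        for column in value_columns:
            long_rows.append({
                id_column: record[id_column],
                names_to: column,            # former column name becomes a value
                values_to: record[column],   # former cell value
            })
    return long_rows


long_table = pivot_longer(wide, "product", ["2021", "2022"], "year", "units")

for row in long_table:
    print(row)
```

The wide table has one row per product; the long table has one row per product and year, which is the change in granularity noted above.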
- Data Source Filter
- Filter -> add -> select column -> uncheck include null values ->
- Text Table
- Worksheet -> duplicate as crosstab -> New sheet will get created -> open it
- Stacked Bar Chart
- Say category in Columns and Sales in Rows
- if we want to breakup sales by region -> Drag region in Color -> creates Stacked Bar Chart
- Highlight Table(Multiple Dimensions & one Measure)
- Display sales of each subcategory by region for all Years in table format
- Region to rows
- Year, subcategory to columns
- Sales to text
- Drag Sales to Color (the lighter the color, the lower the value)
- Change the mark type from Automatic (Text) to Square
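Behind a highlight table sits a cross-tab of one measure over several dimensions, with each cell shaded by its value. A minimal sketch in plain Python, using hypothetical rows and a hypothetical `intensity` helper that scales a cell to the range 0 to 1:

```python
from collections import defaultdict

# Hypothetical (region, subcategory, sales) rows.
rows = [
    ("East", "Chairs", 100), ("East", "Phones", 300),
    ("West", "Chairs", 50), ("East", "Chairs", 25),
]

# Cross-tab: SUM(sales) for every region / subcategory cell.
table = defaultdict(lambda: defaultdict(int))
for region, subcategory, sales in rows:
    table[region][subcategory] += sales

all_values = [value for cells in table.values() for value in cells.values()]
low, high = min(all_values), max(all_values)


def intensity(value):
    """Scale a cell value to 0..1; 0 is the lightest color, 1 the darkest."""
    return (value - low) / (high - low) if high > low else 0.0


for region, cells in table.items():
    for subcategory, value in cells.items():
        print(region, subcategory, value, round(intensity(value), 2))
```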
- HeatMap (Multiple Dimensions & up to 2 measures)
- A highlight table gives both measures the same color if a 2nd measure is added
- How to differentiate: show one measure by Color and the other by Size
- Analytical problems
- Say you added 3 new products in store.
- To compare the performance of the new products with existing products
- Group the new products and remaining products
- Groups
- Right click on column(subcategory) -> create -> Group -> select and name them
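A group is a fixed mapping from members of a dimension to a label. A minimal sketch in plain Python of the new-versus-existing comparison, with hypothetical product names and sales:

```python
from collections import defaultdict

# Hypothetical names of the 3 newly added products.
NEW_PRODUCTS = {"Standing Desk", "Webcam", "Monitor Arm"}

# Hypothetical (product, sales) rows.
rows = [
    ("Standing Desk", 400), ("Webcam", 120), ("Chair", 300),
    ("Monitor Arm", 80), ("Table", 500),
]


def product_group(product):
    """Static group: members are listed by hand and do not change with the data."""
    return "New products" if product in NEW_PRODUCTS else "Existing products"


sales_by_group = defaultdict(int)
for product, sales in rows:
    sales_by_group[product_group(product)] += sales

print(dict(sales_by_group))
```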
- Quick Filter
- Add category to Filter panel -> right click -> show Filter
- This enables user to filter on the fly
- SETS
- Top 5 customers with highest profit
- order by profit -> select top5 -> create SET
- Drop set to color
- Publish the Dashboard
- File -> Save to Tableau Public As
- setting -> show sheets as tab
Top 5 subcategories by sales
- Filters -> top 5
- show all categories and differentiate
- sub-category -> rightclick -> create -> set-> TOP -> choose the values ->
- Identify top customer with sales but with low profit margin
- Two sets -> Top 100 customers by sales & bottom 100 by margin/profit
- Customer name -> create SET -> TOP CUSTOMERS BY SALES
- Customer name -> create SET -> BOTTOM CUSTOMERS BY PROFIT
- Drag customer, and 2 sets to rows
- Also, add sets to filters
- Combined Set
- Select Both the sets -> right click -> create combined set
- Drag the combined set to Filter
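The combined set here is the intersection of the two sets: customers who are both high on sales and low on profit. A minimal sketch in plain Python, with hypothetical customers and N = 3 instead of 100 to keep the data small:

```python
# Hypothetical sales and profit per customer.
customers = {
    "Asha": {"sales": 900, "profit": 20},
    "Ben": {"sales": 800, "profit": 300},
    "Chen": {"sales": 700, "profit": 10},
    "Dina": {"sales": 100, "profit": 5},
}


def top_n(data, field, n):
    """Names of the n customers with the highest value of `field`."""
    ranked = sorted(data, key=lambda name: data[name][field], reverse=True)
    return set(ranked[:n])


def bottom_n(data, field, n):
    """Names of the n customers with the lowest value of `field`."""
    ranked = sorted(data, key=lambda name: data[name][field])
    return set(ranked[:n])


top_by_sales = top_n(customers, "sales", 3)
bottom_by_profit = bottom_n(customers, "profit", 3)

# Combined set: members shared by both sets.
combined = top_by_sales & bottom_by_profit

# Like a set on a shelf, every customer is either In or Out.
in_out = {name: ("In" if name in combined else "Out") for name in customers}

print(sorted(combined), in_out)
```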
- Sets vs groups
- Groups are static, whereas SETS can be dynamic
- SETS have only 2 values (in/out), whereas groups can have many
- SETS can be created only on dimensions; groups can be on measures and dimensions.
- Parameters
- Top 100 Customer by Sales
- Create Parameter from dropdown next to Search
- Show parameter
- Modify the set to refer to Parameter
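A parameter makes the "100" in "Top 100" a value the viewer can change, and the set is recomputed from it. A minimal sketch in plain Python, where the argument `top_n` stands in for the parameter and the data and function name are hypothetical:

```python
# Hypothetical total sales per customer.
sales_by_customer = {"Asha": 900, "Ben": 800, "Chen": 700, "Dina": 100}


def top_customers(sales, top_n):
    """Return the set of the `top_n` customers by sales.

    `top_n` plays the role of the parameter: changing it changes
    the members of the set without redefining the set itself.
    """
    ranked = sorted(sales, key=sales.get, reverse=True)
    return set(ranked[:top_n])


print(sorted(top_customers(sales_by_customer, 2)))
print(sorted(top_customers(sales_by_customer, 3)))
```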
- Hierarchy
- Drag state and place on Country -> creates Hierarchy element
- add other fields like city
- Types of Filters
- Extract Filter
- Filter that is applied on Extract itself(on initial data Extract)
- Data Source Filter
- Context Filter
- Top 10 furniture subcategories
- Dimensions filter
- Measure filter
- Table Calculation Filter
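The order in which filters apply changes the result, which is why "Top 10 furniture subcategories" needs a context filter. A minimal sketch in plain Python with hypothetical rows and Top 2 instead of Top 10: it compares ranking first and filtering by category afterwards with filtering by category first (the context) and ranking afterwards.

```python
# Hypothetical (category, subcategory, sales) rows.
rows = [
    ("Furniture", "Chairs", 300), ("Furniture", "Tables", 200),
    ("Furniture", "Bookcases", 100), ("Technology", "Phones", 900),
    ("Technology", "Copiers", 800),
]


def top_n_subcategories(records, n):
    """Subcategories of the n rows with the highest sales, best first."""
    ranked = sorted(records, key=lambda record: record[2], reverse=True)
    return [subcategory for _, subcategory, _ in ranked[:n]]


# Top N computed over ALL data, category filter applied afterwards:
# the top 2 are both Technology, so nothing is left for Furniture.
top_overall = top_n_subcategories(rows, 2)
without_context = [
    subcategory
    for category, subcategory, _ in rows
    if category == "Furniture" and subcategory in top_overall
]

# Category filter applied first (as a context filter), Top N afterwards.
furniture_only = [record for record in rows if record[0] == "Furniture"]
with_context = top_n_subcategories(furniture_only, 2)

print(without_context, with_context)
```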