Tableau

Intro to Data Visualization & Tableau

  • Visualization
    • Way to represent data that is visually appealing and interactive
    • Way to graphically represent data in the form of charts
  • BI tool / Data Visualization tool
    • Representing data to gain or generate more compelling insights
    • Easy to understand
    • Using data intelligence to drive your business
  • Why is Tableau called a BI tool
    • It turns data into visual representations that surface business insights
Tableau
  • Data Visualization tool
  • Using data to drive decisions
  • Other visualization tools
    • Power BI
    • Python
    • Excel
  • Dashboard
    • Collection of different data summaries that together tell a complete story
  • Why Tableau
    • No coding knowledge required
    • Connect to multiple data sources with ease
      • Excel
      • MySQL
      • AWS
      • Google Sheets
      • Twitter
      • APIs
    • Interactivity
    • Data refresh
    • Deployment to Tableau Server
Visualization requires
  • Import file
  • Preprocess to clean the data
  • Manipulate the data
  • Summarize the data
  • Come up with insights from the data
  • Create visualizations
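
The first four steps above (import, clean, manipulate, summarize) can be sketched outside Tableau in plain Python. This is only a minimal illustration: the inline CSV, its column names, and the rule "drop any row with an empty field" are assumptions made for the example, not part of the original notes.

```python
import csv
import io

# Hypothetical raw data with two incomplete rows (illustrative only).
RAW = """product,quantity,sales
Chair,2,200.0
Pen,,15.0
Phone,1,650.0
Desk,1,
"""

# 1. Import the file (an in-memory string stands in for a real file here).
records = list(csv.DictReader(io.StringIO(RAW)))

# 2. Preprocess: drop rows where any field is empty.
clean = [r for r in records if all(value != "" for value in r.values())]

# 3. Manipulate: convert text fields to numeric types.
typed = [
    {"product": r["product"], "quantity": int(r["quantity"]), "sales": float(r["sales"])}
    for r in clean
]

# 4. Summarize: aggregate the measures.
total_quantity = sum(r["quantity"] for r in typed)
total_sales = sum(r["sales"] for r in typed)

print(f"kept {len(clean)} of {len(records)} rows")
print(f"total quantity: {total_quantity}, total sales: {total_sales}")
```

In Tableau the same cleaning is typically done with the Data Interpreter and data source filters rather than code.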
What is required to do Analysis
  • What the data is about
  • Shape of data
    • #rows & #cols
  • Attributes/characteristics of the data
  • Level of data -> Granularity (slicing down to the smallest unit)
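
Shape and granularity can be checked before any chart is built. The sketch below assumes a small, made-up order table; the column names and the helper `is_unique_key` are hypothetical. It tests which combination of columns identifies exactly one row, which is one way to state the level of the data.

```python
import csv
import io

# Hypothetical order-line data; the column names are illustrative assumptions.
RAW = """order_id,product,category,quantity,sales
O-1,Chair,Furniture,2,200.0
O-1,Pen,Office Supplies,10,15.0
O-2,Chair,Furniture,1,100.0
O-3,Phone,Technology,1,650.0
"""

rows = list(csv.DictReader(io.StringIO(RAW)))

# Shape of the data: number of rows and number of columns.
n_rows = len(rows)
n_cols = len(rows[0])


def is_unique_key(records, key_columns):
    """Return True if the given columns identify each row uniquely."""
    keys = [tuple(r[c] for c in key_columns) for r in records]
    return len(keys) == len(set(keys))


# Granularity: the set of columns that identifies exactly one row.
order_level = is_unique_key(rows, ["order_id"])
order_line_level = is_unique_key(rows, ["order_id", "product"])

print(f"shape: {n_rows} rows x {n_cols} cols")
print(f"one row per order: {order_level}")
print(f"one row per order and product: {order_line_level}")
```

Here an order can contain several products, so the data sits at the order-line level rather than the order level.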


https://docs.google.com/spreadsheets/d/1sM1Kjgy99k2tHBtKP5OXcpQLz7SC5yGr/edit#gid=432276125
Download the file to use in Tableau
  • In Google Sheets: File -> Download -> Microsoft Excel (.xlsx)
  • Data Visualization
    • Visualizing, summarizing, aggregating to get insights from it
  • Data scientist
    • Understanding problem
    • Framing a solution (solution approach)
    • What is the insight based on the solution approach
    • What is the recommendation based on the insight
  • Categorical Variables vs Numerical Variables
  • Dimensions
    • Qualitative & categorical
    • Define how we want to split the measure
  • Measures
    • Quantitative information to be measured
    • Usually numerical
  • Discrete vs Continuous
    • Discrete -> finite, countable values (quantity, #students)
    • Continuous -> infinitely many possible values within a range (temperature, rainfall)
  • Data Types
  • Tableau Public vs Tableau Desktop
    • Higher security; row-level security is available in Desktop
    • Tableau Public is limited to 15 million rows per workbook, whereas Desktop has no fixed row limit
    • More connectivity options in Desktop
  • Seasonality
    • When a measure shows a recurring pattern at particular times of the year
  • GeoChart
    • Heat map (geographical heat map)
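
The dimension/measure distinction from this section can be illustrated with a small aggregation: the dimension decides how the measure is split, and the measure is what gets summed. The rows and the helper `sum_by` below are hypothetical and only sketch the idea, not how Tableau computes it internally.

```python
from collections import defaultdict

# Hypothetical rows: "category" is a dimension, "sales" is a measure.
orders = [
    {"category": "Furniture", "sales": 200.0},
    {"category": "Technology", "sales": 650.0},
    {"category": "Furniture", "sales": 100.0},
    {"category": "Office Supplies", "sales": 15.0},
]


def sum_by(records, dimension, measure):
    """Split the measure by the dimension and sum it within each group."""
    totals = defaultdict(float)
    for record in records:
        totals[record[dimension]] += record[measure]
    return dict(totals)


# Without a dimension the measure collapses to a single number.
total_sales = sum(r["sales"] for r in orders)

# With a dimension the same measure is split into one value per category.
sales_by_category = sum_by(orders, "category", "sales")

print(total_sales)
print(sales_by_category)
```

This mirrors what happens when Sales is dropped on Rows alone (one bar) and then Category is added to Columns (one bar per category).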
Data Structuring Options
  • Bar Chart -> Categorical variable on X-axis and numerical variable on Y-axis
  • Line Chart -> Date/time trend on X-axis and numerical variable on Y-axis
  • Pie Chart -> % of Total
    • Label -> Quick Table Calculation -> % of Total
  • Find the number of orders by quantity sold
    • Histogram
      • X-axis has binned values (e.g. 1-5, 6-10, ...)
      • Y-axis has frequency
      • Quantity -> right click -> Create -> Bins
      • Quantity (bin), as a dimension -> X-axis
      • Order ID (count) -> Y-axis, OR drag Order ID to Rows & mark it as a measure (count) instead of a dimension
  • When a table is pivoted from wide to long format, its level (granularity) changes
  • Use Data Interpreter to remove junk rows
  • To convert wide to long -> select the columns -> Pivot -> the column names become row values alongside the column values -> rename the newly added pivot columns
  • Data Source Filter
    • Filters -> Add -> select column -> uncheck "include null values"
  • Text Table
    • Worksheet -> Duplicate as Crosstab -> a new sheet is created -> open it
  • Stacked Bar Chart
    • Say Category is in Columns and Sales is in Rows
    • To break up sales by region -> drag Region to Color -> creates a stacked bar chart
  • Highlight Table (multiple dimensions & one measure)
    • Display sales of each subcategory by region for all years in table format
      • Region to Rows
      • Year, Sub-Category to Columns
      • Sales to Text
      • Drag Sales to Color (the lighter the color, the lower the value)
      • Change the mark type from Automatic (Text) to Square
  • Heat Map (multiple dimensions & up to 2 measures)
    • A highlight table gives the same color to both measures if we add a 2nd measure
    • To differentiate, encode one measure by Color and the other by Size
  • Analytical problems
    • Say you added 3 new products to the store
    • To compare the new products' performance with the existing products
    • Group the new products and the remaining products
    • Groups
      • Right click on the column (Sub-Category) -> Create -> Group -> select the members and name the groups
  • Quick Filter
    • Add Category to the Filters panel -> right click -> Show Filter
      • This enables the user to filter on the fly
  • Sets
    • Top 5 customers with highest profit
      • Sort by profit -> select the top 5 -> Create Set
      • Drop the set on Color
  • Publish the Dashboard
    • File -> Save to Tableau Public As
    • Settings -> Show sheets as tabs
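
The wide-to-long pivot mentioned earlier in this section, and the change in granularity it causes, can be sketched in plain Python. The table, the column names, and the helper `pivot_longer` are hypothetical; Tableau's own Pivot option names the new columns differently until you rename them.

```python
# Hypothetical wide table: one row per region, one column per year.
wide = [
    {"region": "East", "2021": 100, "2022": 120},
    {"region": "West", "2021": 80, "2022": 95},
]


def pivot_longer(records, id_column, names_to, values_to):
    """Turn every non-id column into its own row (wide -> long)."""
    long_rows = []
    for record in records:
        for column, value in record.items():
            if column == id_column:
                continue
            long_rows.append(
                {id_column: record[id_column], names_to: column, values_to: value}
            )
    return long_rows


long_table = pivot_longer(wide, "region", "year", "sales")

# Granularity changes: one row per region becomes one row per region and year.
print(len(wide), "rows in wide format")
print(len(long_table), "rows in long format")
for row in long_table:
    print(row)
```

The totals are unchanged by the pivot; only the level of each row moves from region to region-and-year, which is what makes Year usable as a dimension.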
Filters and calculations

  • Top 5 subcategories by sales
    • Filters -> Top 5
    • To show all sub-categories and still differentiate the top 5, use a set
      • Sub-Category -> right click -> Create -> Set -> Top -> choose the values
  • Identify top customers by sales that have a low profit margin
    • Two sets -> Top 100 customers by sales & bottom 100 by margin/profit
    • Customer Name -> Create Set -> TOP CUSTOMERS BY SALES
    • Customer Name -> Create Set -> BOTTOM CUSTOMERS BY PROFIT
    • Drag Customer Name and the 2 sets to Rows
    • Also, add the sets to Filters
    • Combined Set
      • Select both sets -> right click -> Create Combined Set
      • Drag the combined set to Filters
  • Sets vs groups
    • Groups are static, whereas sets can be dynamic
    • Sets have only 2 values (In/Out), whereas groups can have many
    • Sets can be created only on dimensions; groups can be created on measures and dimensions
  • Parameters
    • Top 100 Customer by Sales
      • Create Parameter from the dropdown next to Search
      • Show Parameter
      • Modify the set to refer to the parameter
  • Hierarchy
    • Drag State and drop it on Country -> creates a hierarchy
    • Add other fields such as City
  • Types of Filters
    • Extract Filter
      • Filter applied to the extract itself (when the data is initially extracted)
    • Data Source Filter
    • Context Filter
      • e.g. Top 10 furniture subcategories: add Category = Furniture to context so the Top 10 is computed within Furniture
    • Dimension filter
    • Measure filter
    • Table Calculation Filter
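
The two-set approach described above (top customers by sales, bottom customers by profit, then a combined set) amounts to a set intersection, with the parameter controlling N. The customer names, the numbers, and the helpers `top_n` and `bottom_n` below are made up for illustration.

```python
# Hypothetical per-customer totals; names and numbers are illustrative only.
customers = {
    "Asha": {"sales": 900.0, "profit": 20.0},
    "Bruno": {"sales": 850.0, "profit": 300.0},
    "Chen": {"sales": 400.0, "profit": 10.0},
    "Dara": {"sales": 120.0, "profit": 60.0},
}


def top_n(totals, measure, n):
    """Names of the n customers with the highest value of the measure."""
    ranked = sorted(totals, key=lambda name: totals[name][measure], reverse=True)
    return set(ranked[:n])


def bottom_n(totals, measure, n):
    """Names of the n customers with the lowest value of the measure."""
    ranked = sorted(totals, key=lambda name: totals[name][measure])
    return set(ranked[:n])


N = 2  # plays the role of the Tableau parameter the sets refer to

top_by_sales = top_n(customers, "sales", N)
bottom_by_profit = bottom_n(customers, "profit", N)

# Combined set: members shared by both sets (high sales AND low profit).
combined = top_by_sales & bottom_by_profit

# A set has only two values per member: In or Out.
membership = {name: ("In" if name in combined else "Out") for name in customers}

print(sorted(combined))
print(membership)
```

Changing `N` recomputes both sets and the intersection, which is the dynamic behaviour that distinguishes sets from static groups.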
