Gen AI

 Introduction

  • Landscape of GenAI
    • https://explodingtopics.com/blog/chatgpt-users
    • Bard (from Google)
  • Working of GenAI
  • Impact of GenAI
Technology Stack
  • Models & APIs 
    • Closed LLMs -> not open source / not free
      • PaLM 2 (Google)
      • LLaMA (Meta/Facebook)
      • Claude (Anthropic)
      • Cohere
    • Open LLM
      • OpenLLaMA
      • HuggingChat
      • Dolly
      • StableLM
    • Image Models
      • Midjourney
      • DALL·E 2
    • Music Model
      • MusicLM
      • MusicGen
    • Databases
      • Pinecone -> vector DB (used by Grammarly)
      • Chroma
    • LLM Frameworks -> to create chain of LLM tasks
      • LangChain
      • LlamaIndex
    • Deployment
      • HuggingFace
      • Docker
      • Vertex AI
  • AI
    • ML
      • DL
        • NLP
          • GEN AI

Pre-requisites
  • Python Library
  • Machine Learning
    • Used for tabular/structured data
    • A mathematical function that predicts close to the real value
      • This requires probability/statistics, hypothesis testing, and maths for ML
    • Analogy: like an astrologer, a model makes predictions from what it has observed
    • A mathematical function used to predict a value is called a model
    • It is all about finding that mathematical function (see the sketch below this list)
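    • A toy sketch of "a model is just a fitted mathematical function" (numpy, made-up numbers):
      import numpy as np

      # made-up tabular data: x = years of experience, y = salary in lakhs
      x = np.array([1, 2, 3, 4, 5], dtype=float)
      y = np.array([3.1, 4.9, 7.2, 9.1, 10.8])

      # fit a straight line y ≈ w*x + b; this fitted function *is* the model
      w, b = np.polyfit(x, y, deg=1)
      print(f"model: y = {w:.2f}*x + {b:.2f}")
      print("prediction for x=6:", w * 6 + b)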
  • Deep Learning
    • Predictions on unstructured data
      • Images, video, audio
      • Can also be used on tabular data
      • Still a mathematical function (more complex than a classical ML function)
      • Based on neural network algorithms
  • Neural Networks
    • Object detection
  • NLP(Natural Language Processing)
    • Text Data
  • Generative AI  ("Attention Is All You Need" - the Transformer paper behind modern GenAI)
    • Generate text, audio, image, videos
    • Extension to NLP world
  • https://platform.openai.com/
    • https://platform.openai.com/tokenizer
    • https://platform.openai.com/settings/organization/billing/preferences
    • https://platform.openai.com/api-keys
    • sk-proj-... (API key redacted; keep secret keys out of shared notes)
    • curl https://api.openai.com/v1/chat/completions \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer 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" \
        -d '{
          "model": "gpt-4o-mini",
          "store": true,
          "messages": [
            {"role": "user", "content": "write a haiku about ai"}
          ]
        }'
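    • The tokenizer page linked above shows how text maps to tokens; a rough local equivalent, assuming OpenAI's tiktoken package, is:
      !pip install tiktoken
      import tiktoken

      # cl100k_base is the encoding used by the gpt-3.5-turbo / gpt-4 family
      enc = tiktoken.get_encoding("cl100k_base")
      tokens = enc.encode("write a haiku about ai")
      print(len(tokens), tokens)  # token count and token ids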
  • !pip install openai
  • import openai
    from google.colab import userdata
    openai.api_key = userdata.get('vdkey')  # key kept in Colab secrets, not hard-coded
  • messages=[
        {"role":"system","content":"You are helpful assistant."},
        {"role":"user","content":"who won IPL 2020."}
    ]
  • chat_response = openai.chat.completions.create(
        model="gpt-3.5-turbo-16k",
        messages=messages
    )
  • chat_response
  • chat_response.choices[0].message.content
  • prompt = '''You are a helpful neural-network teaching assistant. Explain the various optimisation methods in neural networks. Provide an exhaustive summary of the methods describing what they do, sample code for each, and guidelines on when to use each method.'''
  • message=[  {"role":"user","content":prompt}]
  • chat_response = openai.chat.completions.create(
        model="gpt-3.5-turbo-16k",
        messages=message, max_tokens=200, temperature=0.5, n=1, stop=None, frequency_penalty=0, presence_penalty=0)
  • chat_response.choices[0].message.content
  • messages=[
        {"role":"system","content":"You are helpful assistant designed to output JSON to the key 'answer'."},
        {"role":"user","content":"who won IPL 2020."}
    ]
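  • A minimal sketch of actually forcing JSON output (assumes an openai>=1.x client and a JSON-mode-capable model such as gpt-3.5-turbo-1106 or later):
    import json
    chat_response = openai.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        response_format={"type": "json_object"},  # ask the API to emit valid JSON
        messages=messages)
    print(json.loads(chat_response.choices[0].message.content)["answer"])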

  • !pip install openai
    import openai
    from google.colab import userdata
    openai.api_key = userdata.get('vdkey')
    base_prompt = '''You are a helpful neural-network teaching assistant. Explain the various optimisation methods in neural networks. Provide an exhaustive summary of the methods describing what they do, sample code for each, and guidelines on when to use each method. The topic is: {0}'''
    topic_name = "Auto Encoders"
    prompt1 = base_prompt.format(topic_name)
    message = [{"role":"user","content":prompt1}]
    chat_response = openai.chat.completions.create(
        model="gpt-3.5-turbo-16k",
        messages=message,
        max_tokens=200,
        temperature=0.5,
        n=1,
        stop=None,
        frequency_penalty=0,
        presence_penalty=0)
    chat_response.choices[0].message.content


    • Mini use cases
      • Math tutor
      • support tickets
    !pip install openai

    import openai
    from google.colab import userdata
    openai.api_key=userdata.get('vdkey')
    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)
    filepath='/content/drive/MyDrive/GenAI/'
    !ls '/content/drive/MyDrive/GenAI/'
    with open(filepath+"Q1FY24 Earnings Call Transcript.txt","r") as f:
      transcript = ' '.join(f.readlines())

    len(transcript)
    base_instruction = '''You are a helpful assistant which helps financial analysts retrieve relevant financial
    and business related information from documents. Given below is a question and the transcript of an earnings
    call of a paints company, Asian Paints, which was attended by the top management of the firm. Try to
    respond with specific numbers and facts wherever possible. If you are not sure about the accuracy of the
    information, just respond that you do not know.'''
    question = "How much has Asian Paints' business grown?"
    # the second assignment overrides the first; only this question is sent
    question = "Summarise the key financial metrics reported in the earnings call related to revenue growth, profitability, cash flow and debt."

    prompt=base_instruction+"\n\n"+"Question: {0}".format(question)+"\n\n"+"Transcript: \n {0}".format(transcript)

    message=[{"role":"user","content":prompt}]
    print(message)
    chat_response = openai.chat.completions.create(
        model="gpt-3.5-turbo-16k",
        messages=message,
        max_tokens=200,
        temperature=0.5,
        n=1,
        stop=None,
        frequency_penalty=0,
        presence_penalty=0)
    print(chat_response.choices[0].message.content)

    #Few shot prompting
    !pip install openai
    import openai
    from google.colab import userdata
    openai.api_key=userdata.get('vdkey')
    chat_response = openai.chat.completions.create(
        model="gpt-3.5-turbo-16k",
        messages=[
            {"role":"system","content":"you are an ai tutor that assists school students with math homework problems."},
            {"role":"user","content":"help me solve the equation 2x-10=20."},
            {"role":"assistant","content":"try moving the 10 to the rightside of the equation. what do you get."},
            {"role":"user","content":"x + 10 = 15"}
        ]
        )

    print(chat_response.choices[0].message.content)

    --
    with open(filepath + "AI_tutor_system_message_1.txt", "r") as f:
        system_message = ' '.join(f.readlines())
    print(system_message)
    message_history = [
        {"role":"system","content":system_message},
        {"role":"user","content":"help me solve the equation 3x-9=21"},
        {"role":"assistant","content":"Sure! Try moving the 9 to the right hand side of the equation. What do you get?"},
        {"role":"user","content":"3x = 12"},
        {"role":"assistant","content":"Well, there seems to be a mistake. When you move 9 to the right hand side, you need to change its sign. Can you try again?"},
        {"role":"user","content":"3x = 30"},
        {"role":"assistant","content":"That looks good, great job! Now, try to divide both sides by 3. What do you get?"},
        {"role":"user","content":"x = 10"},
        {"role":"user","content":"help me solve the equation x-10=2x"},
    ]
    --
    max_conversations = 20
    conversation_length=0



    • Using the example, complete the request
      Example:
      Question: What is the most popular language for data analysis?
      Answer: Python

      Request:
      Question: What is the most popular data file format for data analysis?
      Answer: 
    • You are an experienced executive vice president in software development. What do you look for in a Quarterly Business Review from the Product Sustenance team?
    http://thonny.org/
    Streamlit and Python for graphs


    • Load Balancer between user and actual app servers
    • API servers
    • Low-write, heavy-read workload
      • Read replicas using master/slave architecture
      • Cache with TTL
      • CDN servers (Akamai) -> for static content like images
    • Engineering interview
      • Clarifying questions w.r.t. functional requirements
      • Non-functional requirements
      • Math estimates (back-of-the-envelope)
      • Data flow /  Api Design
      • High level design
      • Detail design
    • AI use cases
      • functional requirement
      • convert req to AI requirement
      • Data source and review 
      • Model development
      • Evaluate model
      • Deploy and serving
      • Monitoring
    • Back-of-the-envelope: 1 million × 1 byte ≈ 1 MB
    • 1 billion × 1 byte ≈ 1 GB
    • How to find users who are logged in but not active on the application
      • Heartbeats
        • from user to server - every 5 seconds
        • Redis stores data as <key, value> pairs and reads from memory
          • <key, value, ttl> (see the heartbeat sketch below)
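        • A minimal heartbeat sketch with redis-py (key names are hypothetical; assumes a local Redis):
          import redis

          r = redis.Redis()

          def heartbeat(user_id: str) -> None:
              # the client calls this every 5 seconds; the key expires if beats stop
              r.setex(f"active:{user_id}", 15, 1)  # TTL of 15s = 3 missed beats

          def is_active(user_id: str) -> bool:
              return r.exists(f"active:{user_id}") == 1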

      • Connection Pooling in DB
      • Sharding when one DB system cannot hold all data
      • nginx - load balancer
    • Monolith & Microservice
      • Anything that does not need to be real-time should not be handled in real time, e.g. payment confirmation
        • Use a queueing mechanism for such use cases
      • API server
        • Routes each request to the intended microservice
        • Rate limiting (see the token-bucket sketch below)
        • Security
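      • A toy token-bucket limiter of the kind a gateway applies per client (an illustrative sketch, not any specific gateway's implementation):
        import time

        class TokenBucket:
            def __init__(self, rate: float, capacity: float):
                self.rate, self.capacity = rate, capacity   # tokens/sec, burst size
                self.tokens, self.last = capacity, time.monotonic()

            def allow(self) -> bool:
                now = time.monotonic()
                # refill in proportion to elapsed time, capped at capacity
                self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
                self.last = now
                if self.tokens >= 1:
                    self.tokens -= 1
                    return True
                return False

        bucket = TokenBucket(rate=5, capacity=10)   # 5 requests/sec, bursts of 10
        print([bucket.allow() for _ in range(12)])  # first 10 pass, then throttled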
    • BookMyShow
      • User should be able to view event  -> Availability is priority
      • book event for the show   -> consistency is priority
      • Search any event   -> Availability is priority
      • Out of scope
        • User Registration
        • View my orders
        • Admin adding an event
        • Dynamic pricing
      • Non-Functional Requirements
        • Consistency and Availability(CAP Theorem)
        • low latency
        • read heavy (100:1)
        • Out of scope
          • Regular backups
          • ci/cd pipelines
          • secure trans for purchase
          • GDPR -> Cannot keep European data outside Europe
      • Core Entities
        • Events
        • Users
        • Venues
        • Performers
        • Tickets
        • Bookings
      • High level design
        • User -> api gateway(auth, rate_limit, route to intended microservice) -> Event Service -> DB
    • Diff between Jr and Sr engineers
      • Ability to wear Business Hat
      • Product Hat
      • Engineering Hat
    • Low latency
      • SQL query
        • Slow for patterns like LIKE '%xyz%' (cannot use an index)
      • Elasticsearch
        • Fast for search operations
      • Redis or Memcache for recent events, sitting between the service and the SQL DB
      • Keep the SQL DB in sync with Elasticsearch using a CDC (change data capture) queue
    • Scaling
      • DB - Read Heavy
        • have replicas
        • It will have eventual consistency
        • Read your own write -> to overcome eventual consistency for the post owner
      • DB - Write Heavy
        • Range/Hash Partition data by some discriminator
        • Have read replicas for each write node
        • Range partitions risk hot spots
        • Hashing can distribute data evenly (see the hash-sharding sketch at the end of this section)
        • Monitor system for hot spots and spawn a new machine for such cases automatically
      • Auto Scale
        • Use a queuing system for non-time-critical use cases like likes and subscriber counts
          • Async
          • Delegate
      • SQS - Simple Queue Service
        • Only one subscriber per event
      • Kafka
        • Consumer groups: one group for counting, another for recommendations
      • Videos
        • Presigned / preauthenticated URLs
        • Clients upload directly to the presigned URL, bypassing the app servers (see the boto3 sketch below)
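      • A sketch of generating a presigned upload URL with boto3 (assumes AWS credentials are configured; bucket and key names are hypothetical):
        import boto3

        s3 = boto3.client("s3")
        # the client uploads the video bytes straight to S3 via this URL,
        # bypassing the application servers entirely
        url = s3.generate_presigned_url(
            "put_object",
            Params={"Bucket": "my-videos-bucket", "Key": "uploads/run_123.mp4"},
            ExpiresIn=3600,  # URL valid for one hour
        )
        print(url)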

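    • A minimal hash-partitioning sketch for the write-heavy case above (shard names are hypothetical; plain modulo reshuffles keys when shards are added, which is why consistent hashing is used in practice):
      import hashlib

      SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

      def shard_for(key: str) -> str:
          # stable hash (unlike Python's salted hash()) so routing survives restarts
          h = int(hashlib.md5(key.encode()).hexdigest(), 16)
          return SHARDS[h % len(SHARDS)]

      print(shard_for("user:42"), shard_for("user:43"))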
    • Flipkart/Amazon
      • Data pattern
        • Most recent orders: read/write
        • 3-6 months old: frequent reads
        • Very old: infrequent reads (for audit purposes only)
      • Hot Store DB
        • most recent read/write
        • very fast
        • expensive
        • strong consistency
        • transactional
        • read/write
      • Warm Store
        • read only
        • no transactions
        • less expensive
        • slightly slower
      • Cold Store
        • infrequent reads, cheaper, slow
        • for compliance and audit
        • for analytics

    • Run Keeper
      • Functional Spec
        • User should be able to start, stop, pause and save their run or ride
        • While running(bicycling) -> view activity data, route, distance and time
        • News feed -> your own activity or friends' activities
        • Post-MVP
          • comment or like
          • authentication/authorization
          • friend management
          • Music
        • Non Functional
          • Availability over consistency
          • 10 Million concurrent users
          • Stats accuracy in real time
          • offline internet cases
        • DB design
          • Core Entities
            • User
            • Activities
            • Route
            • Friend
        • Haversine distance - great-circle distance from GPS coordinates (see the sketch in the Zomato section below)
        • 10M concurrent activities sending gps details every 3 secs - which will be around 500 TB/year
          • sharding
            • by time/day/week  - as users see their recent activity
            • cache/hot/warm/cold storage architecture as well
        • sending location every few seconds
          • use local device memory/db
        • Leaderboard
          • Redis - sorted set
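          • A minimal sketch with redis-py sorted sets (key and member names are hypothetical):
            import redis

            r = redis.Redis()

            # record each runner's total distance as the score; ZADD keeps it sorted
            r.zadd("leaderboard:weekly", {"alice": 42.5, "bob": 38.1, "carol": 51.0})

            # top 10 runners, highest distance first
            print(r.zrevrange("leaderboard:weekly", 0, 9, withscores=True))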

    • EBay auction
      • Post an item for auction with starting price and end date
      • Users must be able to bid on that item with higher bidding price
      • Users must be able to view auctions
      • Out of scope
        • Searching
        • history of auctions
      • Non functional requirements
        • high consistency
        • Fault tolerance & Durability
        • 10M concurrent users - Scalability
        • current highest bid
      • Core entities
        • Auction
        • Users
        • bids
        • Items
      • APIs
        • post /auctions
        • post /auctions/:id/biddings
        • get /auctions/:id
      • High Level Design
        • User -> api gateway(routing, authentication, rate limit)  -> api server -> db
      • Locking
        • Redis
        • Redis is single-threaded, so atomic operations avoid concurrency issues (see the lock sketch below)
      • Two-phase commit
      • Optimistic concurrency control (version check at commit time)
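      • A minimal sketch of a Redis lock around bid placement (redis-py SET with nx/ex; key names are hypothetical):
        import uuid
        import redis

        r = redis.Redis()

        def place_bid(auction_id: str, amount: float) -> bool:
            token = str(uuid.uuid4())
            lock_key = f"lock:auction:{auction_id}"
            # NX = set only if absent (one winner), EX = auto-release after 5s
            if not r.set(lock_key, token, nx=True, ex=5):
                return False  # someone else holds the lock; retry later
            try:
                highest = float(r.get(f"auction:{auction_id}:highest") or 0)
                if amount > highest:
                    r.set(f"auction:{auction_id}:highest", amount)
                    return True
                return False
            finally:
                # release only if we still own the lock
                if r.get(lock_key) == token.encode():
                    r.delete(lock_key)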

    • Instagram's feed ranking model
      • Rank posts/reels in a user's feed to maximise engagement (likes/shares/comments)
      • Diversity in content(diff creator, genre)
      • Freshness in it(recent posts)
    • Business perspective
      • Focus on suggested posts (new creators should be highlighted) - tangentially connected
      • Higher level - daily active users increase -> click-through rate increases -> average session time increases
      • Individual level - relevant content
    • Functional requirements
      • Improve DAU, average time spent, CTR
      • User engagement (likes, shares, comments, follows, hashtags)
    • Non-Functional 
      • Scalable
      • available
      • latency < 100ms
      • monetization(ad revenue)
      • tooling(debugging, monitoring, alert & warning)
      • Analytics (content, creator demographics)
    • Estimation
      • 500 million daily active users
      • Storage 
        • Structured data (users, follows, post meta data)
        • Data lakes (s3/HDFS) for capturing some interaction records
    • Curating posts to show
      • Identify the ~100 most relevant posts among ~1B posts (say)
      • ranking
      • post processing
    • Important Data/features
      • Posts (creators, audio/video/text embeddings) - an embedding is a fixed-size vector that captures interactions and relations
      • Viewers
        • Aggregated views (likes in the last 7 days; delayed features, e.g. like counts as of 14 days ago)
    • Model generation
      • Candidate generation (narrow ~1B posts down to the ~100 most relevant)
      • Rank by the predicted probability of engagement
    • Deep dive
      • Build the machinery that lets us learn the embeddings
    • Collaborative filtering
    • Two-tower network (see the toy sketch below)
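    • A toy numpy illustration of the two-tower idea: one tower embeds the user, another embeds the post, and the dot product scores relevance (random weights, purely illustrative):
      import numpy as np

      rng = np.random.default_rng(0)

      # "towers": tiny one-layer maps from raw features to a shared 8-d space
      W_user = rng.normal(size=(8, 5))    # 5 user features -> 8-d embedding
      W_post = rng.normal(size=(8, 7))    # 7 post features -> 8-d embedding

      user = rng.normal(size=5)           # one user's feature vector
      posts = rng.normal(size=(1000, 7))  # candidate posts

      u = W_user @ user                   # user embedding, shape (8,)
      p = posts @ W_post.T                # post embeddings, shape (1000, 8)

      scores = p @ u                      # dot-product relevance scores
      top100 = np.argsort(scores)[::-1][:100]  # candidate generation: top 100
      print(top100[:10])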


    Zomato
    • Functional requirements
      • Accurate ETA (real time prediction is optional)
      • Dynamic Updates -> travel time, restaurant prep time, order complexity, rider availability
      • restaurant and rider integration
    • Non-functional requirements
      • Scalability
      • low latency
      • high availability
      • Data Security & Privacy
    • Data Sources and Structures
      • Order data (orderId, restaurantId, paymentInfo, deliveryAddress, special instructions)
      • Restaurant data (restaurantId, name, address, location (lat, long), prep time, operational hours, rating)
      • Rider data
      • Customer data
      • Traffic data (Road network data, speed, congestion, historic traffic pattern, weather conditions)
    • Storage
      • Raw data lake (unstructured data on Amazon S3)
      • Structured data - AWS Redshift
      • Frequently accessed -> DynamoDB
      • Metadata store - AWS Glue Data Catalog
    • Data Processing & feature engineering
      • ETL
    • Feature engineering (see the Haversine sketch below this list)
      • Distances - Haversine distance
      • Time-based features
      • Restaurant features
      • Rider-specific features
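    • A sketch of the Haversine great-circle distance (the same "Haversine distance" mentioned in the Run Keeper notes above):
      import math

      def haversine_km(lat1, lon1, lat2, lon2):
          # great-circle distance between two (lat, lon) points in kilometres
          R = 6371.0  # mean Earth radius, km
          p1, p2 = math.radians(lat1), math.radians(lat2)
          dp = math.radians(lat2 - lat1)
          dl = math.radians(lon2 - lon1)
          a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
          return 2 * R * math.asin(math.sqrt(a))

      # e.g. restaurant -> customer distance in Bengaluru (illustrative coordinates)
      print(haversine_km(12.9716, 77.5946, 12.9352, 77.6245))  # ~5 km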
    • Feature encoding (toy examples below)
      • one hot
      • target
      • ordinal
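    • Toy illustrations of the three encodings with pandas (made-up data):
      import pandas as pd

      df = pd.DataFrame({
          "cuisine": ["indian", "chinese", "indian", "italian"],
          "size":    ["small", "large", "medium", "small"],
          "eta_min": [25, 40, 30, 35],
      })

      # one-hot: one binary column per category
      onehot = pd.get_dummies(df["cuisine"], prefix="cuisine")

      # ordinal: categories with a natural order get integer ranks
      df["size_ord"] = df["size"].map({"small": 0, "medium": 1, "large": 2})

      # target encoding: replace each category with the mean target value for it
      df["cuisine_te"] = df.groupby("cuisine")["eta_min"].transform("mean")
      print(pd.concat([df, onehot], axis=1))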
    • SageMaker (training, inference, endpoint generation)


    Airline ticket shopping
    • Airlines, travel agencies and analysts require real-time data to optimise pricing, demand forecasting and market-trend analysis
    • Dynamic Price Optimization
      • Demand surge, competition, User search pattern
    • Personalized recommendations
      • tailor made flight options
    • Demand forecasting
      • supply chain, airport timings
    • Market trend analysis
      • Identify popular routes
      • peak travel time
      • emerging travel destinations
    • Technical problems
      • Ingesting - High volume - 10 TB - continuously streamed
      • Data acquisition
        • GDS (Global Distribution System)
        • Air India API
        • Meta search engines - Kayak, Skyscanner, Google Flights
      • Data 
        • Structured - flight number, origin, destination, date/time, price, cabin class, #passengers
        • Semi-structured - JSON, XML, e.g. messages
        • Unstructured - search queries, IP addresses
    • Feature engineering
      • Flight-specific -> origin, destination, departure date, return, class, airline, no. of stops, flight duration, price
      • Transaction-specific -> PNR, credit card, discounts, points
      • Contextual features -> user location, IP address, device type, app/browser, OS version
    • Derived features 
      • Route popularity
      • Price Volatility  
      • demand indicator
      • competitor pricing features, price diff
    • Models to build
      • price prediction -> regression
      • demand forecasting -> predict the number of bookings/search volumes for a particular route/origin-destination pairs -> regression or classification
      • Anomaly detection -> identify unusual price spikes or patterns that could indicate market disruptions
      • personalized recommendations -> 
        • linear models vs tree models
      • Neural networks
        • automatic feature engineering, capture complex patterns, more data, less interpretable
    • AWS Services
      • Building, training, deploying -> SageMaker
    • Architecture
      • Data sources
        • GDS/Airlines API

    • LLMs as teaching assistant
      • Neural networks designed to understand, generate or respond to human language
      • Deep neural networks trained on massive datasets
      • Question answering
      • translation
      • summarization
      • sentiment analysis
    • Requirement: students often have questions related to the lesson
    • Architecture
      • User -> System/Server (course, chats, doubts) -> ChatGPT
    • RAG -> Retrieval Augmented Generation
      • Optimises response generation by retrieving external, domain-specific documents and adding them to the prompt (see the minimal sketch at the end of this list)
    • Fraud analytics
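    • A minimal RAG sketch (assumes the openai>=1.x embeddings endpoint; chunking and a vector DB are omitted; the documents are hypothetical course notes):
      import numpy as np
      import openai

      docs = [
          "Lesson 3: gradient descent updates weights against the loss gradient.",
          "Lesson 5: attention lets a model weigh tokens by relevance.",
          "Lesson 7: RAG retrieves documents and adds them to the prompt.",
      ]

      def embed(texts):
          resp = openai.embeddings.create(model="text-embedding-3-small", input=texts)
          return np.array([d.embedding for d in resp.data])

      doc_vecs = embed(docs)
      question = "What does attention do?"
      q = embed([question])[0]

      # retrieve: cosine similarity, keep the best-matching note
      sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
      context = docs[int(np.argmax(sims))]

      # augment + generate
      answer = openai.chat.completions.create(
          model="gpt-3.5-turbo-16k",
          messages=[{"role": "user",
                     "content": f"Answer using only this context:\n{context}\n\nQ: {question}"}])
      print(answer.choices[0].message.content)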
