Lab – Big Data

Your submission should be a Word document that includes screenshots of every step you perform.

Big Data – Hadoop Ecosystems

Lab #3

  • Import the accounts table into the HDFS file system:

    1) Import the accounts table:

    $ sqoop import \
    --connect jdbc:mysql://localhost/loudacre \
    --username training --password training \
    --table accounts \
    --target-dir /loudacre/accounts \
    --null-non-string '\\N'
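
The quoting around the --null-non-string value decides what the shell hands to Sqoop. A minimal sketch, using only the shell itself, of how single quotes and no quotes differ for this value. The variable names are illustrative and not part of the lab, and how Sqoop interprets the string afterwards is not covered here.

```shell
# Single quotes: the shell passes both backslashes through untouched.
quoted='\\N'
# No quotes: the shell treats the first backslash as an escape and drops it.
unquoted=\\N

# Print each value on its own line exactly as a program would receive it.
printf '%s\n' "$quoted" "$unquoted"
```

The first line printed keeps both backslashes and the second keeps only one, which is why straight single quotes matter when typing the command.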

    2) List the contents of the accounts directory:

    $ hdfs dfs -ls /loudacre/accounts

    3) Import incremental updates to accounts

    As Loudacre adds new accounts to the MySQL accounts table, the account data in HDFS must be
    updated to match. You can use Sqoop to append these new records.

    Run the add_new_accounts.py script to add the latest accounts to MySQL.

    $ $DEV1/exercises/sqoop/add_new_accounts.py

    Incrementally import the newly added accounts and append them to the accounts
    directory. Have Sqoop check the acct_num column, and give it the largest
    account ID already imported as the last value:

    $ sqoop import \
    --connect jdbc:mysql://localhost/loudacre \
    --username training --password training \
    --incremental append \
    --null-non-string '\\N' \
    --table accounts \
    --target-dir /loudacre/accounts \
    --check-column acct_num \
    --last-value <largest_acct_num>
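
The last value is the largest acct_num that was already imported. A rough sketch of one way to find it, assuming the imported files are comma-delimited text with acct_num as the first field (the column order is an assumption about the accounts table). The helper name max_first_field is hypothetical and the sample records are made up.

```shell
# max_first_field: read comma-delimited records on stdin and print the
# largest numeric value found in the first field.
max_first_field() {
  cut -d, -f1 | sort -n | tail -n 1
}

# Sample records standing in for the imported data (made-up values).
printf '%s\n' '3,alice' '10,bob' '2,carol' | max_first_field

# On the lab VM the same filter could be fed from HDFS before the
# incremental import is run, for example:
#   hdfs dfs -cat /loudacre/accounts/part-m-* | max_first_field
```

The numeric sort matters here: a plain text sort would rank 3 above 10.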

    4) You should see three new files. Use Hadoop’s cat command to view the entire contents of
    these files.

    $ hdfs dfs -cat /loudacre/accounts/part-m-0000[456]
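
The [456] at the end of the path is a glob character class that matches one character out of 4, 5, or 6. A small sketch of which names the pattern selects, using the shell's own case matching as a stand-in for the HDFS path matcher. The file names in the loop are hypothetical; a real listing comes from hdfs dfs -ls.

```shell
# Hypothetical part-file names; only those ending in 4, 5, or 6 should match.
matches=""
for name in part-m-00000 part-m-00003 part-m-00004 part-m-00005 part-m-00006 part-m-00007; do
  case "$name" in
    part-m-0000[456]) matches="$matches $name" ;;
  esac
done

echo "matched:$matches"
```

If the three new files carry different numbers on your VM, adjust the digits inside the brackets to match the listing.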

    Developing with Spark and Hadoop:

    Big Data – Hadoop Ecosystems

    Lab #2

    Import the Device table from MySQL

    1. Open a new terminal window if necessary.

    2. Get familiar with Sqoop by running its command-line help:

    $ sqoop help

    3. List the tables in the loudacre database:

    $ sqoop list-tables \
    --connect jdbc:mysql://localhost/loudacre \
    --username training --password training

    4. Run the sqoop import command to see its options:

    $ sqoop import --help

    5. Use Sqoop to import the device table from the loudacre database and save it in HDFS under /loudacre.
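
One possible shape for this import, offered as a sketch and not as the expected answer. The connection string and credentials are the ones used by the list-tables step; the target directory /loudacre/device is an assumption, since the lab only says "under /loudacre". The function name device_import_cmd is hypothetical.

```shell
# Assumption: /loudacre/device mirrors the /loudacre/accounts path used
# elsewhere in these labs; the handout does not name the directory.
target_dir=/loudacre/device

# Print the import command so it can be reviewed before it is run.
device_import_cmd() {
  printf '%s\n' "sqoop import --connect jdbc:mysql://localhost/loudacre --username training --password training --table device --target-dir $target_dir"
}

device_import_cmd
# To run the import on the lab VM, paste the printed line into the terminal.
```

Printing the command first keeps the sketch harmless on a machine without Sqoop and makes it easy to capture in a screenshot.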

    Big Data – Hadoop Ecosystems

    Lab #1
    Setup and General Notes

    Dr. Gasan Elkhodari

    Lab #1 – General Note

    This lab uses a virtual machine (VM) running the CentOS Linux distribution. The VM has CDH
    (Cloudera’s Distribution, including Apache Hadoop) installed in pseudo-distributed mode, a
    method of running Hadoop in which all Hadoop daemons run on the same machine. It is, essentially,
    a cluster consisting of a single machine. It works just like a larger Hadoop cluster; the only
    difference (apart from speed, of course) is that the block replication factor is set to 1,
    since there is only a single DataNode available.
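
The replication factor described above can be read back from the configuration. A minimal sketch, assuming the hdfs command is on the PATH inside the VM; the function name replication_factor is hypothetical, and the fallback branch exists only so the snippet also runs outside the VM.

```shell
# Print the configured block replication factor, or "unknown" when the
# hdfs command is not available (for example, outside the lab VM).
replication_factor() {
  if command -v hdfs >/dev/null 2>&1; then
    hdfs getconf -confKey dfs.replication
  else
    echo unknown
  fi
}

replication_factor
```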

    Lab #1 – HDFS Setup

    Enable services and set up any data required for the course. You must run this script before starting
    the Lab.

    $ $DEV1/scripts/training_setup_dev1.sh

    Lab #1 – Access HDFS with Command Line

    • Assignment

    1) Move the data folder “KB”, located under
    “/home/training/training_materials/data”, into /loudacre in the
    Hadoop file system

    Hints:
    • Use the ‘hdfs dfs -mkdir’ command to create a new directory ‘loudacre’
    • Use the ‘hdfs dfs -put’ command to move the data from the local Linux file
    system into the HDFS file system
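
The two hints can be strung together roughly as follows. This is a sketch under stated assumptions and not the graded answer: the paths come from the assignment text, the step helper and the DRY_RUN variable are hypothetical, and by default the commands are only printed so the snippet is safe to read through on a machine without Hadoop.

```shell
# Paths taken from the assignment text.
src=/home/training/training_materials/data/KB
dest=/loudacre

# step: print the command by default; run it only when DRY_RUN=0 is set.
step() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

step hdfs dfs -mkdir -p "$dest"     # -p: no error if the directory exists
step hdfs dfs -put "$src" "$dest/"  # copy the KB folder into HDFS
step hdfs dfs -ls "$dest/KB"        # confirm the files arrived
```

Note that -put copies the folder and leaves the local copy in place. If the local copy really must be removed, hdfs dfs -moveFromLocal is the variant that deletes the source after the transfer.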
