Intro
Apache Impala is an open source massively parallel processing (MPP) SQL query engine for data stored in a computer cluster running Apache Hadoop. It consists of several processes that run on specific hosts within your CDH cluster. The Domo Apache Impala SSH connector brings data from your Apache Impala server into Domo securely through an SSH tunnel. The Apache Impala SSH connector is a “Database” connector, meaning it retrieves data from a database using a query. In the Data Center, you can access the connector page for this and other Database connectors by clicking Database in the toolbar at the top of the window. This topic discusses the fields and menus that are specific to the Apache Impala SSH connector user interface. General information for adding DataSets, setting update schedules, and editing DataSet information is discussed in Adding a DataSet Using a Data Connector.
Prerequisites
To connect to your Apache Impala database and create a DataSet, you must have the following:
- The username and password you use to log into your SSH host
- The SSH host you wish to tunnel through
- The port number of your SSH host
- The SSH private key
- The username and password you use to log into your Apache Impala database
- The host name or IP address for the database server (e.g., db.company.com)
- The port number for the database
- The database name
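If you want to check the SSH-related values before entering them in Domo, one option is to open an equivalent tunnel from your own machine with the OpenSSH client. The sketch below only assembles that command; it is not how the connector itself is implemented. The bastion host, user name, and key path are hypothetical placeholders, and the port values (22 for SSH, 21050 for Impala) are assumed defaults that may differ in your environment.

```python
import shlex
from dataclasses import dataclass


@dataclass
class TunnelSettings:
    """Values from the prerequisites list; the defaults are assumptions."""
    ssh_host: str
    ssh_user: str
    private_key_path: str
    db_host: str
    ssh_port: int = 22       # common SSH default; confirm with your administrator
    db_port: int = 21050     # assumed Impala port; yours may differ
    local_port: int = 21050  # local end of the tunnel, chosen arbitrarily


def build_ssh_tunnel_command(s: TunnelSettings) -> list[str]:
    """Return the argv for an OpenSSH local port forward to the database host."""
    return [
        "ssh",
        "-i", s.private_key_path,                         # SSH private key file
        "-p", str(s.ssh_port),                            # port number of the SSH host
        "-N",                                             # forward only, run no remote command
        "-L", f"{s.local_port}:{s.db_host}:{s.db_port}",  # local port -> database host:port
        f"{s.ssh_user}@{s.ssh_host}",
    ]


settings = TunnelSettings(
    ssh_host="bastion.company.com",          # hypothetical SSH host
    ssh_user="tunnel_user",                  # hypothetical SSH username
    private_key_path="/home/me/.ssh/id_rsa", # hypothetical key location
    db_host="db.company.com",
)
command = build_ssh_tunnel_command(settings)
print(shlex.join(command))
```

If the printed command connects from a machine with similar network access, the same host, port, and key values are reasonable candidates for the connector's credential fields.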
Connecting to Your Apache Impala Database
This section enumerates the options in the Credentials and Details panes on the Apache Impala SSH Connector page. The components of the other panes on this page, Scheduling and Name & Describe Your DataSet, are universal across most connector types and are discussed at greater length in Adding a DataSet Using a Data Connector.
Credentials Pane
This pane contains fields for entering the credentials used to connect to your SSH host and your Apache Impala database. The following table describes what is needed for each field:

| Field | Description |
|---|---|
| SSH Server Host name | Enter the SSH host name you wish to tunnel through. |
| SSH Port | Enter the port number of your SSH host. |
| SSH Username | Enter the username you use to log into your SSH host. |
| SSH Password | Enter the password you use to log into your SSH host. |
| SSH Private Key | Enter the SSH private key. |
| Host | Enter the hostname or IP address of your database server. Example: db.company.com |
| Database Port | Enter your Apache Impala port number. |
| Database Name | Enter your Apache Impala database/schema name. |
| Username | Enter your Apache Impala username. |
| Password | Enter your Apache Impala password. |
| Database Connection String Parameter(s) | Enter the parameter(s) you want to include in the database connection string. Multiple parameters are separated by a semicolon. (Example: AuthMech=3;SSL=1;AllowSelfSignedCerts=1) |
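Because the connection string parameters are entered as a single semicolon-separated string, a malformed entry is easy to miss. The following is a minimal sketch, assuming each parameter has the `Key=Value` form shown in the example above. The function names are hypothetical, and the code checks only the format, not whether the database driver recognizes a given parameter.

```python
def parse_connection_parameters(raw: str) -> dict[str, str]:
    """Split 'Key=Value;Key=Value' into a dict, rejecting malformed entries."""
    params: dict[str, str] = {}
    for entry in raw.split(";"):
        entry = entry.strip()
        if not entry:  # tolerate a trailing semicolon
            continue
        key, sep, value = entry.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"Expected Key=Value, got {entry!r}")
        params[key.strip()] = value.strip()
    return params


def format_connection_parameters(params: dict[str, str]) -> str:
    """Join parameters back into the semicolon-separated form."""
    return ";".join(f"{key}={value}" for key, value in params.items())


parsed = parse_connection_parameters("AuthMech=3;SSL=1;AllowSelfSignedCerts=1")
print(parsed)
print(format_connection_parameters(parsed))
```

Parsing and re-joining the string before pasting it into the field is a cheap way to catch a missing `=` or a stray separator.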
Details Pane
This pane contains a primary Reports menu, along with various other menus which may or may not appear depending on the report type you select.

| Menu | Description |
|---|---|
| Query Type | Select a query type. |
| Query | Enter the SQL query to execute. The query runs on the Apache Impala server and fetches the data from it. |
| Query Parameter | Enter the query parameter value. This is the initial value for the query parameter. The last run date is optional; if it is not provided, the default value for the last run date is ‘02/01/1700’. Example: |
| Database Table | Select the database table. |
| Table Columns | Select the table columns. |
| Query Helper | This query is automatically generated when you select a table and columns in the Database Table and Table Columns fields, respectively. Copy and paste this query into the Query field if you need help building a query. |
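The Query Helper entry describes a query built from the selected table and columns. As a rough illustration of that idea, the sketch below assembles a simple `SELECT` statement from a table name and a column list, quoting identifiers with backticks as Impala SQL allows. The function names, the `sales` table, and its columns are hypothetical, and the query Domo actually generates may be formatted differently.

```python
import re

# Conservative pattern for plain identifiers; an assumption, not Domo's rule.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def quote_identifier(name: str) -> str:
    """Wrap a validated identifier in backticks."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"Unsupported identifier: {name!r}")
    return f"`{name}`"


def build_select_query(table: str, columns: list[str]) -> str:
    """Build 'SELECT col, ... FROM table' from the selected table and columns."""
    if not columns:
        raise ValueError("Select at least one column")
    column_list = ", ".join(quote_identifier(column) for column in columns)
    return f"SELECT {column_list} FROM {quote_identifier(table)}"


query = build_select_query("sales", ["order_id", "amount"])  # hypothetical names
print(query)
```

A statement of this shape can be pasted into the Query field and then extended with filters or joins as needed.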
Other Panes
For information about the remaining sections of the connector interface, including how to configure scheduling, retry, and update options, see Adding a DataSet Using a Data Connector.
FAQs
What kind of credentials do I need to power up this connector?
How frequently will my data update?
Are there any API limits that I need to be aware of?
Can I use the same Apache Impala account to create multiple datasets?
What do I need to be aware of while writing a query?
Why can't I connect to my Apache Impala database? Do I need to whitelist any IP addresses?