Your company has a Microsoft Azure environment that contains an Azure HDInsight Hadoop cluster and an Azure SQL data warehouse. The Hadoop cluster contains text files that are formatted by using UTF-8 character encoding.
You need to implement a solution to ingest the data to the SQL data warehouse from the Hadoop cluster. The solution must provide optimal read performance for the data after ingestion.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
You manage a Microsoft Azure HDInsight Hadoop cluster. All of the data for the cluster is stored in Azure Premium Storage.
You need to prevent all users from accessing the data directly. The solution must allow only the HDInsight service to access the data.
Which five actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
You extend the dashboard of the health tracking application to summarize fields across several users.
You need to recommend a file format for the activity data in Azure that meets the technical requirements.
What is the best recommendation to achieve the goal? More than one answer choice may achieve the goal. Select the BEST answer.
You are designing a solution that will use Apache HBase on Microsoft Azure HDInsight.
You need to design the row keys for the database to ensure that client traffic is directed over all of the nodes in the cluster.
What are two possible techniques that you can use? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.
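For background, the HBase documentation describes salting and hashing as ways to keep monotonically increasing row keys from concentrating traffic on a single region. The following is a minimal sketch of both ideas in plain Python, not an answer key for the question above. The bucket count, the key format, and the function names are hypothetical and chosen only for illustration.

```python
import hashlib

# Hypothetical number of salt buckets (for example, one per region server).
NUM_BUCKETS = 8


def salted_row_key(natural_key: str, num_buckets: int = NUM_BUCKETS) -> str:
    """Prefix the key with a bucket number derived from a hash of the key.

    The prefix is deterministic, so a reader can recompute it for point lookups.
    """
    digest = hashlib.md5(natural_key.encode("utf-8")).digest()
    bucket = digest[0] % num_buckets
    return f"{bucket:02d}-{natural_key}"


def hashed_row_key(natural_key: str) -> str:
    """Replace the key with its hash, which scatters adjacent natural keys."""
    return hashlib.md5(natural_key.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    for n in range(1, 6):
        key = f"user{n:04d}"
        print(key, salted_row_key(key), hashed_row_key(key)[:8])
```

The trade-off in either case is that range scans over the natural key order become more expensive, because adjacent keys no longer sit next to each other.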
You are designing a solution based on the lambda architecture.
The solution has the following layers:
You are planning the data ingestion process and the query execution.
For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.
Your company has a data visualization solution that contains a customized Microsoft Azure Stream Analytics solution. The solution provides data to a Microsoft Power BI deployment.
Every 10 seconds, you need to query for instances that have more than three records.
How should you complete the query? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content. NOTE: Each correct selection is worth one point.
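The requirement above (count records per instance in fixed 10-second intervals and keep the instances with more than three) can be illustrated outside Stream Analytics. The following is a minimal Python sketch of that windowed count, assuming events arrive as `(timestamp_seconds, instance_id)` pairs. The function name, the event shape, and the window boundary convention (start inclusive, end exclusive) are assumptions of this sketch, not the Stream Analytics query syntax or its exact window semantics.

```python
from collections import Counter

WINDOW_SECONDS = 10  # fixed, non-overlapping window length
THRESHOLD = 3        # keep groups with more than this many records


def instances_over_threshold(events, window_seconds=WINDOW_SECONDS, threshold=THRESHOLD):
    """Return sorted (window_start, instance_id, count) tuples where count > threshold.

    events: iterable of (timestamp_seconds, instance_id) pairs (hypothetical shape).
    """
    counts = Counter()
    for timestamp, instance_id in events:
        # Assign each event to the window that contains its timestamp.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, instance_id)] += 1
    return sorted(
        (window_start, instance_id, count)
        for (window_start, instance_id), count in counts.items()
        if count > threshold
    )


if __name__ == "__main__":
    sample = [(0, "a"), (1, "a"), (2, "a"), (3, "a"), (4, "b"),
              (11, "a"), (12, "a"), (13, "a")]
    print(instances_over_threshold(sample))
```

In the sample, instance "a" has four records in the first window and three in the second, so only the first window passes the "more than three" filter.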
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
Your company has multiple databases that contain millions of sales transactions.
You plan to implement a data mining solution to identify purchasing fraud.
You need to design a solution that mines 10 terabytes (TB) of sales data.
The solution must meet the following requirements:
Run the analysis to identify fraud once per week.
Continue to receive new sales transactions while the analysis runs.
Be able to stop computing services when the analysis is NOT running.
Solution: You create a Microsoft Azure Data Lake job.
Does this meet the goal?
Users report that when they access data that is more than one year old from a dashboard, the response time is slow.
You need to resolve the issue that causes the slow response when visualizing older data.
What should you do?
A. Process the event hub data first, and then process the older data on demand.
B. Process the older data on demand first, and then process the event hub data.
C. Aggregate the older data by time, and then save the aggregated data to reference data streams.
D. Store all of the data from the event hub in a single partition.
You have four on-premises Microsoft SQL Server data sources as described in the following table.
You plan to create three Azure data factories that will interact with the data sources as described in the following table.
You need to deploy Microsoft Data Management Gateway to support the Azure Data Factory deployment. The solution must use new servers to host the instances of Data Management Gateway.
What is the minimum number of new servers and data management gateways you should deploy? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
You have a Microsoft Azure Data Factory pipeline.
You discover that the pipeline fails to execute because data is missing.
You need to rerun the failure in the pipeline.
Which cmdlet should you use?