Thursday, December 21, 2023

SRE Interview questions

1. Explain the concept of SLO, SLI, and SLA. How are they interconnected?

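- Answer: Service Level Indicators (SLIs) are metrics that measure how a service behaves, such as request latency or error rate. Service Level Objectives (SLOs) set target values for those SLIs over a period. Service Level Agreements (SLAs) are agreements with customers that commit to a level of service and usually specify consequences when it is missed. They are interconnected: SLIs measure whether SLOs are achieved, and SLOs are typically set stricter than the SLA, so meeting them keeps the service in compliance with the SLA.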

2. What is the significance of error budgets in an SRE context? How do you calculate and utilize error budgets?

- Answer: An error budget is the amount of unreliability (errors or downtime) a service may accumulate over a period while still meeting its SLO. It quantifies how reliable the service needs to be. It is calculated by subtracting the SLO target from 100%; for example, a 99.9% availability SLO leaves a 0.1% error budget, which is about 43 minutes over 30 days. Utilizing error budgets helps prioritize improvements and allows for controlled risk-taking during development.
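
The arithmetic can be sketched in Python. This is a minimal illustration, assuming an SLO expressed as a fraction and a 30-day window; the function names `error_budget` and `budget_remaining` are hypothetical and do not come from any library.

```python
def error_budget(slo_target, window_minutes):
    """Return (budget_fraction, allowed_bad_minutes) for an SLO given as a fraction."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    budget_fraction = 1.0 - slo_target  # error budget = 100% - SLO target
    return budget_fraction, budget_fraction * window_minutes


def budget_remaining(slo_target, total_requests, failed_requests):
    """Fraction of a request-based error budget still unspent (negative if overspent)."""
    allowed_failures = (1.0 - slo_target) * total_requests
    if allowed_failures == 0:
        raise ValueError("no requests in the window, so the budget is undefined")
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    fraction, minutes = error_budget(0.999, 30 * 24 * 60)
    print(f"budget: {fraction:.2%} of the window, about {minutes:.1f} minutes")
    print(f"remaining: {budget_remaining(0.999, 1_000_000, 250):.0%}")
```

A team could compare the remaining fraction against a threshold of its own choosing to decide when to pause feature releases; the threshold itself is a policy decision, not something this sketch prescribes.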

3. Describe how you'd implement a service monitoring system from scratch.

- Answer: I'd start by identifying key metrics (like latency, error rates, etc.) and setting up monitoring tools (like Prometheus, Grafana). Then, I'd create alerting rules based on these metrics and establish dashboards for visualization. Additionally, implementing logging and tracing tools helps in comprehensive system monitoring.

4. Discuss the importance of chaos engineering in maintaining system reliability. Provide examples of chaos engineering experiments.

- Answer: Chaos engineering involves deliberately injecting failures into a system to test its resilience. It helps identify weaknesses and improve system robustness before real incidents expose them. Example experiments include simulating network outages, shutting down services at random, and introducing latency, then observing how the system behaves.

5. How do you handle incidents in a production environment? Explain your incident response process and any tools you might use.

- Answer: Our incident response involves detecting issues through monitoring, assigning severity levels, and activating incident response teams. We use incident management tools like PagerDuty or OpsGenie, follow predefined runbooks, conduct post-incident reviews, and update documentation.

6. Explain in detail the principles of "Error Budgets" and how they influence decision-making in an SRE team.

- Answer: Error budgets define the acceptable failure rate for a service. They allow teams to balance stability and innovation: while budget remains, teams can ship changes and take risks; when the budget is consumed, teams focus on reliability over new features. This principle guides the allocation of engineering resources for improvements.

7. Design an automated incident response system that can handle complex failures and prioritize critical incidents.

- Answer: I'd create an incident orchestration system that uses machine learning to predict incident severity. It would automatically trigger predefined response actions based on severity, escalating critical incidents to on-call teams and documenting incident resolution steps for future reference.

8. Discuss the role of service meshes like Istio in improving service observability and reliability.

- Answer: Service meshes like Istio manage communication between microservices, offering observability through metrics, logging, and tracing. They enhance reliability by providing fault tolerance, traffic control, and security features like mutual TLS.

9. How would you implement a zero-downtime deployment strategy for a large-scale microservices-based application?

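- Answer: Use blue-green or canary deployments, and run multiple instances of each microservice behind a load balancer. With a canary deployment, route a small portion of traffic to the new version, validate its performance, and gradually shift the rest of the traffic. With a blue-green deployment, bring up the new version alongside the old one, switch traffic once it passes its checks, and keep the old version available for rollback.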
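
The traffic-splitting step of a canary rollout can be sketched as follows. This is an illustrative assumption about how a router might pick a version, not the behavior of any particular load balancer or service mesh; `choose_version` and `canary_share` are hypothetical names.

```python
import hashlib


def choose_version(request_id, canary_percent):
    """Route a request to 'canary' or 'stable' using a stable hash of its ID."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # deterministic bucket in [0, 100)
    return "canary" if bucket < canary_percent else "stable"


def canary_share(request_ids, canary_percent):
    """Fraction of the given requests that would be routed to the canary."""
    ids = list(request_ids)
    hits = sum(1 for r in ids if choose_version(r, canary_percent) == "canary")
    return hits / len(ids)


if __name__ == "__main__":
    ids = [f"req-{i}" for i in range(10_000)]
    for percent in (1, 10, 50, 100):
        print(percent, f"{canary_share(ids, percent):.1%}")
```

Because the bucket depends only on the request ID, a request routed to the canary at 10% stays on the canary when the percentage is raised, which keeps behavior consistent while traffic is shifted.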

10. Describe the process of setting up a multi-region, active-active architecture for disaster recovery and high availability.

- Answer: Implement redundant infrastructure in multiple regions, distribute traffic across regions using DNS or a global load balancer, and replicate data between regions. Use health checks to automatically reroute traffic in case of failures.

Friday, November 3, 2023

Some examples of #PromQL queries

Get the average CPU usage per container across all pods, in cores, as a 5-minute rate:

avg(rate(container_cpu_usage_seconds_total{container!=""}[5m]))

Get the number of requests per second to the web service:

sum(rate(http_request_total{service="web"}[5m]))

Get the 95th percentile of response times for the web service (assuming http_request_duration_seconds is a histogram):

histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket{service="web"}[5m])))

Get the number of errors per second for the database:

rate(database_errors_total{database="mysql"}[5m])

Get the total amount of memory used by all pods:

sum(container_memory_usage_bytes{container!=""})

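Get the percentage of requests that took longer than 100ms to respond (assuming a histogram with a bucket boundary at 0.1 seconds):

(1 - sum(rate(http_request_duration_seconds_bucket{service="web", le="0.1"}[5m])) / sum(rate(http_request_duration_seconds_count{service="web"}[5m]))) * 100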

Some #interview #questions related to #Prometheus

1. What is Prometheus, and why is it popular for monitoring and alerting?

Answer: Prometheus is an open-source monitoring and alerting toolkit. It's popular because it is designed for reliability, scalability, and flexibility, making it suitable for cloud-native applications and complex environments.

2. Explain the key components of the Prometheus architecture.

Answer: The main components of Prometheus are the Prometheus server, client libraries, exporters, Alertmanager, and Pushgateway. The server scrapes and stores metrics, exporters expose metrics from services that are not instrumented directly, Alertmanager handles alerts, and Pushgateway is used for short-lived jobs. Grafana is a separate project that is commonly paired with Prometheus for visualization.

3. What is a Prometheus exporter, and how do you use it in monitoring?

Answer: A Prometheus exporter is a software component that collects and exposes metrics from various services or systems in a format that Prometheus can scrape. Exporters are used to monitor applications and infrastructure not natively supported by Prometheus.

4. Explain the difference between push and pull-based monitoring systems. How does Prometheus operate?

Answer: In a push-based system, monitored applications send their metrics to the monitoring system. Prometheus is pull-based: the Prometheus server periodically scrapes metrics over HTTP from exporters and instrumented services.

5. What is PromQL, and how do you use it to query data in Prometheus?

Answer: PromQL is the query language used with Prometheus. You use it to create queries and alerting rules to retrieve, filter, and manipulate time-series data for monitoring and alerting purposes.

6. Differentiate between a gauge and a counter metric in Prometheus.

Answer: Gauges represent single numerical values that can go up and down, while counters are cumulative and only increase, resetting to zero when the process restarts. Counters are used for metrics like request counts, usually queried with rate(), while gauges are used for values like current memory usage or queue length.
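
Why counters are usually queried as a rate can be sketched in Python. This is a simplified illustration of counter-reset handling, assuming a list of (timestamp_seconds, value) samples; it omits the extrapolation that PromQL's rate() performs, and `counter_increase` and `simple_rate` are hypothetical names.

```python
def counter_increase(samples):
    """Total increase across (timestamp, value) samples, treating any drop as a reset."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    total = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        if cur >= prev:
            total += cur - prev
        else:
            # The counter went down, so the process restarted and counted up from zero.
            total += cur
    return total


def simple_rate(samples):
    """Average per-second increase between the first and last sample."""
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time range")
    return counter_increase(samples) / elapsed


if __name__ == "__main__":
    samples = [(0, 100), (15, 130), (30, 10), (45, 40)]  # reset between t=15 and t=30
    print(counter_increase(samples), simple_rate(samples))
```

A gauge needs no such treatment, because its current value is already the quantity of interest.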

7. Explain how alerting works in Prometheus and the role of the Alertmanager.

Answer: Alerts are defined as alerting rules in Prometheus. When a rule's condition has held for its configured duration, the alert fires and is sent to the Alertmanager. Alertmanager manages grouping, routing, deduplication, silencing, and notification to various receivers (e.g., email, chat, or webhook).

8. What are service discovery mechanisms in Prometheus, and why are they important?

Answer: Service discovery mechanisms in Prometheus, such as DNS-based, file-based, Kubernetes, and cloud provider discovery, automatically find new instances of a service as they come and go. This keeps the list of scrape targets current without manual configuration, which simplifies monitoring in dynamic environments.

9. What are recording rules, and why are they useful in Prometheus?

Answer: Recording rules are used to precompute and store frequently needed or computationally expensive queries as new time series. This optimizes query performance and reduces load on the Prometheus server.

10. How can you back up and restore Prometheus data, and why is this important?

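Answer: You can back up Prometheus data by taking a snapshot through the TSDB admin API (which must be enabled) or by copying the data directory while the server is stopped, and storing the copy in a backup location. Restoring data involves copying the backed-up data directory back to the original location before starting Prometheus. Backups are important for disaster recovery and historical data retention.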

11. Explain how federation works in Prometheus and why it's used.

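Answer: Federation in Prometheus allows one Prometheus server to scrape selected time series from another server through its /federate endpoint. It's used for scaling and for aggregating metrics from several servers into a global view; long-term storage of metrics is usually handled by remote storage instead.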

12. What are some common best practices for using Prometheus in a production environment?

Answer: Best practices include regularly updating Prometheus, using labels effectively, setting up proper alerting, optimizing queries, and maintaining adequate storage and retention policies.

13. How do you secure a Prometheus setup, and what are the security best practices?

Answer: Security practices include restricting access to Prometheus endpoints, securing communication with TLS, and using authentication and authorization mechanisms. Additionally, it's important to keep Prometheus and its components updated to address security vulnerabilities.

14. What is the significance of exporters, and what types of exporters are commonly used with Prometheus?

Answer: Exporters are essential for collecting and exporting metrics from various services and systems. Commonly used exporters include the Node Exporter (for system-level metrics), Blackbox Exporter (for probing endpoints), and more.

15. Explain the difference between long-term storage and short-term storage for Prometheus data.

Answer: Short-term storage is Prometheus's local on-disk time-series database, which keeps data for a configured retention period (15 days by default). Long-term storage is used for persistent data retention beyond that period, often in an external system like Thanos or Cortex.

16. How can you integrate Prometheus with Grafana for visualization and dashboards?

Answer: Grafana can connect to Prometheus as a data source, allowing you to create interactive dashboards and visualize Prometheus data.

17. What are some strategies for handling alerting and preventing alert fatigue in Prometheus?

Answer: Strategies include using labels effectively, implementing alert aggregation, and setting up silences to temporarily suppress alerts during maintenance or known issues.

18. Explain how you can implement high availability for Prometheus in a production environment.

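Answer: High availability can be achieved by running two or more identically configured Prometheus servers that scrape the same targets, so that one can fail without losing monitoring; a load balancer can sit in front of them for queries. In addition, Alertmanager can be run as a cluster, which deduplicates the alerts sent by the replicas.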

19. What are some common monitoring solutions and technologies that integrate well with Prometheus?

Answer: Technologies that integrate well with Prometheus include Grafana, Kubernetes, Docker, cloud platforms (e.g., AWS, GCP), and various exporters for specific services and applications.

20. Can you describe a real-world scenario where Prometheus was instrumental in identifying and resolving a production issue?

Answer: The candidate should provide a specific example of a production issue where Prometheus played a key role in identifying the problem, allowing for swift resolution and improved system reliability.

How can you instrument your application to expose metrics for Prometheus?

Answer: You can use a Prometheus client library in your application's code to create custom metrics and expose them for scraping by Prometheus.

Explain the process of creating custom metrics in Prometheus.

Answer: To create custom metrics, you need to use a Prometheus client library compatible with your programming language. Define, register, and update your metrics in your application code.

What are Prometheus client libraries, and can you name a few for different programming languages?

Answer: Prometheus client libraries are libraries or packages that help developers instrument their code to expose metrics. Examples include `prometheus_client` for Python, `prom-client` for Node.js, and `prometheus-net` for .NET.

How do you set up custom labels for your Prometheus metrics?

Answer: You can set custom labels for your metrics using the Prometheus client library in your application code, allowing you to add metadata to your metrics.

Explain the importance of metric naming conventions in Prometheus.

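Answer: Metric names should be descriptive and follow a naming convention that helps others understand the purpose of the metric, for example an application prefix, base units such as seconds or bytes in the name, and a _total suffix on counters. A consistent naming convention improves metric discoverability and readability.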

What is the role of unit tests in Prometheus metric instrumentation?

Answer: Unit tests are crucial to ensure that custom metrics are instrumented correctly. They verify that metric values are being updated as expected in your application code.

How can you handle changes to metric names or labels in your code without breaking existing Prometheus queries and dashboards?

Answer: You should follow best practices for metric naming and label naming, and avoid making breaking changes to existing metrics. Instead, create new metrics with updated names or labels to avoid breaking existing queries and dashboards.

Explain the purpose of histograms and summaries in Prometheus metrics.

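Answer: Histograms and summaries are used to measure the distribution of values such as request durations or response sizes. A histogram counts observations in configurable buckets, so quantiles can be estimated at query time and aggregated across instances. A summary calculates quantiles in the client, which cannot be meaningfully aggregated across instances. Both provide information beyond the average and can help in identifying outliers and performance issues.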
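
How a quantile can be estimated from histogram buckets is sketched below. This is a simplified approximation in the spirit of PromQL's histogram_quantile(), assuming cumulative (upper_bound, count) buckets, linear interpolation inside a bucket, and a lowest bucket that starts at 0; `estimate_quantile` is a hypothetical name, not a library function.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    The last bucket must have an upper bound of math.inf.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets)
    if not buckets or buckets[-1][0] != math.inf:
        raise ValueError("buckets must end with an upper bound of math.inf")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for upper, count in buckets:
        if count >= rank:
            in_bucket = count - prev_count
            if upper == math.inf or in_bucket == 0:
                # Cannot interpolate here, so fall back to the last finite bound.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            return prev_bound + (upper - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = upper, count
    return prev_bound


if __name__ == "__main__":
    buckets = [(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]
    print(estimate_quantile(0.95, buckets))
```

The estimate is only as precise as the bucket layout: here the 95th percentile falls in the 0.5 to 1.0 bucket, so the true value could be anywhere in that range.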

What is the difference between push and pull-based exporters in Prometheus?

Answer: Exporters are pull-based: they expose an HTTP endpoint for Prometheus to scrape. Prometheus does not accept pushed metrics directly, so jobs that cannot be scraped, such as short-lived batch jobs, push their metrics to the Pushgateway, which Prometheus then scrapes.

How can you secure the communication between Prometheus and your application when using a push-based exporter?

Answer: You can serve the push endpoint (for example, the Pushgateway) over HTTPS with TLS certificates so that metrics are encrypted in transit, and require authentication so that only trusted applications can push.

Explain how to handle metric cardinality in Prometheus.

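Answer: Metric cardinality refers to the number of unique label combinations for a metric, each of which is stored as a separate time series. To manage it, avoid labels with unbounded values such as user IDs or request IDs, drop or rewrite unneeded labels with relabeling, and reserve high-cardinality labels for the use cases that genuinely need them.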
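
The effect of one extra label on cardinality can be sketched with simple arithmetic. This is an illustration with made-up label names and counts; `worst_case_series` and `observed_series` are hypothetical helpers.

```python
from math import prod


def worst_case_series(distinct_values_per_label):
    """Upper bound on series for one metric: product of distinct values per label."""
    return prod(distinct_values_per_label.values())


def observed_series(label_sets):
    """Count the unique label combinations actually seen."""
    return len({tuple(sorted(labels.items())) for labels in label_sets})


if __name__ == "__main__":
    labels = {"method": 5, "status": 10, "instance": 20}  # assumed example counts
    print(worst_case_series(labels))                       # bounded label values
    print(worst_case_series({**labels, "user_id": 10_000}))  # one unbounded label added
```

The upper bound is rarely reached in practice, but it shows why a single label with many possible values dominates the series count.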

What are blackbox exporters, and how can they be used in Prometheus monitoring?

Answer: The Blackbox Exporter probes endpoints and services from the outside over protocols such as HTTP, HTTPS, DNS, TCP, and ICMP, and reports whether each probe succeeded and how long it took. It helps monitor the external behavior of applications.

How can you expose metrics from a Docker container for Prometheus to scrape?

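Answer: You can run cAdvisor to expose per-container resource metrics, or have the application inside the container expose its own metrics endpoint and publish that port for Prometheus to scrape. The Node Exporter covers the host machine rather than individual containers.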

Explain the use of the `promtool` utility in Prometheus development.

Answer: `promtool` is used for various tasks, including validating and linting Prometheus configuration files, checking recording rules, and verifying alerting rules.

How can you simulate a production environment for testing Prometheus configurations and metric collection?

Answer: You can use Docker and Docker Compose to set up a testing environment with Prometheus, exporters, and simulated services that expose metrics for testing.

Explain how to configure alerting rules in Prometheus and use them for monitoring applications.

Answer: Alerting rules are configured in Prometheus to define conditions that trigger alerts. You can use the Alertmanager to manage and route these alerts to different notification channels.

What are remote storage integrations, and how can they be used to store Prometheus data?

Answer: Remote storage integrations allow you to store long-term Prometheus data in external time-series systems, such as Thanos, Cortex, or InfluxDB, for scalability and long-term retention.

How can you use the Grafana dashboard for visualizing Prometheus metrics and alerts?

Answer: Grafana can connect to Prometheus as a data source, allowing you to create interactive dashboards, panels, and alerts for visualizing and analyzing Prometheus data.

Explain how to use Prometheus Federation for cross-cluster or cross-organization monitoring.

Answer: With Prometheus Federation, a central Prometheus server scrapes selected, usually pre-aggregated, time series from the /federate endpoint of the Prometheus servers in each cluster or organization. This gives a combined view for cross-cluster or cross-organization monitoring without copying every raw series.

What are some best practices for developing and maintaining Prometheus metric instrumentation in production applications?

Answer: Best practices include setting up automated tests for metrics, documenting metric naming conventions, using custom labels effectively, and regularly reviewing and optimizing metrics for performance and resource usage.

Monday, July 10, 2023

#Apache #spark #dataengineer #questions

What is #ApacheSpark, and how does it relate to #dataengineering?

Answer: #ApacheSpark is an open-source distributed computing framework designed for big data processing and analytics. It provides an interface for programming and managing large-scale data processing tasks across a cluster of computers.

Explain the concept of #RDD (Resilient Distributed Datasets) in Spark.

Answer: #RDD is a fundamental data structure in #Spark that represents an immutable distributed collection of objects. It allows for fault-tolerant and parallel operations on data across a cluster.

How does #Spark #Streaming enable real-time data processing?

Answer: #Spark #Streaming processes live data streams in near real time by breaking them into small batches (micro-batches). It provides high-level abstractions to handle continuous streams of data with the same APIs used for batch processing.

What is the difference between #DataFrame and #RDD in #Spark?

Answer: #DataFrames are a higher-level abstraction built on top of #RDDs that organizes data into named columns with a schema. Because the schema is known, Spark can apply query optimizations for better performance, and DataFrames work with various data formats and data sources.

How does #Spark handle data #partitioning and #parallel processing?

Answer: #Spark distributes data across multiple nodes in a cluster, allowing for parallel processing. It automatically partitions #RDDs into smaller partitions that can be processed in parallel across the available resources.

Explain the concept of lazy evaluation in #Spark.

Answer: #Spark uses lazy evaluation, meaning it postpones the execution of transformations until an action is called. This lets Spark see the whole chain of transformations and optimize the execution plan before running it, for example by pipelining operations and skipping work that is not needed.
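
The idea can be illustrated without Spark using plain Python iterators. This is only an analogy, not Spark's API: `map` and `filter` play the role of transformations, `sum` plays the role of an action, and `build_pipeline` is a hypothetical helper.

```python
def build_pipeline(lines, seen):
    """Chain lazy steps; nothing runs until the result is consumed."""
    def parse(line):
        seen.append(line)  # side effect that records when work actually happens
        return int(line)

    numbers = map(parse, lines)                   # lazy, like a transformation
    return filter(lambda n: n % 2 == 0, numbers)  # still lazy


if __name__ == "__main__":
    seen = []
    evens = build_pipeline(["1", "2", "3", "4"], seen)
    print("parsed before the action:", seen)
    print("sum of evens:", sum(evens))            # consuming the iterator triggers the work
    print("parsed after the action:", seen)
```

Unlike this analogy, Spark also records the chain of transformations as a plan, which is what allows it to optimize the work before executing it.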

What are the benefits of using #SparkSQL for data processing?

Answer: #SparkSQL provides a programming interface and optimizations for querying structured and semi-structured data using SQL queries. It combines the power of #SQL and the flexibility of #Spark's distributed computing capabilities.

How would you optimize the performance of #Spark jobs?

Answer: Performance optimization in Spark can be achieved by tuning various configurations, leveraging data partitioning, using appropriate caching, applying appropriate data compression techniques, and optimizing the execution plan through proper transformations and actions.

What is a Shuffle operation in Spark, and when is it triggered?

Answer: A Shuffle operation in Spark involves redistributing data across partitions during data processing. It is triggered by operations that need records with the same key to be brought together, such as group-by operations, joins, or repartitioning. Because it moves data over the network and writes it to disk, it can have a significant impact on performance.

How would you handle failures and ensure fault tolerance in #Spark?

Answer: #Spark provides built-in mechanisms for fault tolerance, such as lineage information to recover lost data and checkpointing to store intermediate data. By leveraging these features, Spark can recover from failures and continue processing without data loss.

Top 20 #technical #questions for #dataengineer

What is the role of a Data Engineer in an organization?

Answer: A Data Engineer is responsible for designing, developing, and maintaining the infrastructure and systems required for storing, processing, and analyzing large volumes of data in an organization.

What are the key components of a data pipeline?

Answer: The key components of a data pipeline include data ingestion, data storage, data processing, and data delivery. These components work together to ensure a smooth flow of data from various sources to the desired destinations.

What is the difference between batch processing and real-time processing?

Answer: Batch processing involves processing data in large volumes at specific intervals, whereas real-time processing deals with processing data as soon as it arrives, enabling immediate analysis and action.

What are some common data modeling techniques used in data engineering?

Answer: Common data modeling techniques include relational modeling (using tables and relationships), dimensional modeling (for data warehousing), and schema-less modeling (for NoSQL databases).

How do you ensure data quality in a data pipeline?

Answer: Data quality can be ensured by performing data validation, data cleansing, and data profiling. Implementing data quality checks at various stages of the pipeline helps identify and rectify any anomalies or errors.
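
A validation step of this kind might look like the following sketch. It assumes records are plain dictionaries and that the rules are required fields plus numeric ranges; the field names and the helpers `validate_record` and `split_valid_invalid` are hypothetical.

```python
def validate_record(record, required_fields, numeric_ranges):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for field in required_fields:
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")
    for field, (low, high) in numeric_ranges.items():
        value = record.get(field)
        if value is None:
            continue  # absence is handled by the required-field check
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{field} is not numeric")
        elif not low <= value <= high:
            problems.append(f"{field} out of range")
    return problems


def split_valid_invalid(records, required_fields, numeric_ranges):
    """Separate clean records from rejected ones, keeping the reasons for rejection."""
    valid, invalid = [], []
    for record in records:
        problems = validate_record(record, required_fields, numeric_ranges)
        if problems:
            invalid.append((record, problems))
        else:
            valid.append(record)
    return valid, invalid


if __name__ == "__main__":
    rows = [{"id": 1, "amount": 10.0}, {"id": None, "amount": 5}, {"id": 3, "amount": -2}]
    good, bad = split_valid_invalid(rows, ["id"], {"amount": (0, 1000)})
    print(len(good), "valid;", [reasons for _, reasons in bad])
```

Keeping the rejected records together with their reasons, rather than silently dropping them, makes it possible to report on data quality and fix problems at the source.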

What is ETL (Extract, Transform, Load) and how does it relate to data engineering?

Answer: ETL refers to the process of extracting data from various sources, transforming it into a consistent format, and loading it into a target system. Data Engineers often design and implement ETL processes to move and transform data effectively.

What is the role of data partitioning in distributed systems?

Answer: Data partitioning involves dividing large datasets into smaller, manageable partitions that can be processed and stored across multiple machines in a distributed system. It helps improve performance, scalability, and fault tolerance.
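
Hash partitioning, one common way to divide data, can be sketched as follows. This is a minimal illustration, assuming string keys and a fixed number of partitions; it uses CRC32 as a stable hash, which is a choice made for this example rather than what any particular system uses, and the function names are hypothetical.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key to a partition index using a hash that is stable across runs."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_records(records, num_partitions):
    """Group (key, value) records into num_partitions lists by key hash."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


if __name__ == "__main__":
    records = [(f"user-{i}", i) for i in range(100)]
    for index, part in enumerate(partition_records(records, 4)):
        print(index, len(part))
```

All records with the same key land in the same partition, so each machine can process its partitions independently of the others.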

How do you handle big data processing challenges?

Answer: Big data processing challenges can be addressed by utilizing distributed processing frameworks like Apache Hadoop or Apache Spark, which allow for parallel processing and handling large volumes of data efficiently.

What is data warehousing, and how does it differ from a database?

Answer: Data warehousing involves consolidating and organizing data from various sources to support business intelligence and reporting. Unlike a traditional database, a data warehouse is optimized for querying and analyzing large datasets.

Explain the concept of data lakes.

Answer: A data lake is a central repository that stores structured and unstructured data in its raw format. It allows for flexible data exploration and analysis, enabling organizations to derive insights from diverse data sources.

What are the advantages of using cloud-based data storage and processing?

Answer: Cloud-based data storage and processing offer benefits like scalability, cost-effectiveness, and easy access to computing resources. They reduce the need for organizations to invest in and manage their own infrastructure.

How do you ensure data security in a data engineering project?

Answer: Data security can be ensured by implementing encryption techniques, access controls, data masking, and monitoring systems. Regular audits and compliance with security standards also play a vital role.

What is the role of Apache Kafka in data engineering?

Answer: Apache Kafka is a distributed streaming platform that enables real-time data processing and messaging between systems. It acts as a scalable and fault-tolerant data pipeline for handling high volumes of data.

What are the considerations for data backup and disaster recovery in data engineering?

Answer: Data backup and disaster recovery strategies involve creating regular backups, implementing redundant systems, and defining recovery point objectives (RPO) and recovery time objectives (RTO) to minimize data loss and downtime.

How do you optimize query performance in a data warehouse?

Answer: Query performance optimization can be achieved by proper indexing, partitioning, denormalization, and utilizing query optimization techniques provided by the database management system.

What are some data integration techniques commonly used in data engineering?

Answer: Data integration techniques include batch integration (scheduled data transfers), real-time integration (streaming data), and virtual integration (querying data from multiple sources without physical movement).

How do you handle data schema evolution in a data pipeline?

Answer: Data schema evolution can be managed by implementing versioning techniques, using flexible data formats like JSON or Avro, and handling schema changes with proper compatibility checks and data migration strategies.
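
A compatibility check of the kind mentioned above can be sketched as follows. This is a simplified illustration of one rule, that a reader using the new schema can read old data only if every newly added field has a default and no existing field changes type; it ignores type promotion and other rules that real formats such as Avro define. The schema representation and the name `can_read_old_data` are hypothetical.

```python
def can_read_old_data(old_schema, new_schema):
    """Return (ok, problems) for reading data written with old_schema using new_schema.

    Each schema maps a field name to a dict with a "type" and an optional "default".
    """
    problems = []
    for name, spec in new_schema.items():
        if name not in old_schema:
            if "default" not in spec:
                problems.append(f"new field '{name}' has no default")
        elif old_schema[name]["type"] != spec["type"]:
            problems.append(f"field '{name}' changed type")
    return (not problems), problems


if __name__ == "__main__":
    v1 = {"id": {"type": "long"}, "name": {"type": "string"}}
    v2 = {**v1, "email": {"type": "string", "default": ""}}
    v3 = {**v1, "email": {"type": "string"}}
    print(can_read_old_data(v1, v2))
    print(can_read_old_data(v1, v3))
```

Running a check like this before deploying a schema change is one way to catch breaking changes early; a schema registry typically performs a more complete version of it.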

What are the key considerations for data governance in a data engineering project?

Answer: Data governance involves defining policies, processes, and standards for data management, data quality, data privacy, and compliance. It ensures that data is handled responsibly and securely throughout its lifecycle.

Explain the concept of data streaming and its relevance in data engineering.

Answer: Data streaming involves processing and analyzing continuous streams of data in real-time. It is essential for applications that require immediate insights or actions based on rapidly changing data, such as IoT applications or fraud detection systems.

How do you ensure scalability and high availability in a data engineering system?

Answer: Scalability and high availability can be achieved by utilizing distributed systems, load balancing, replication, fault-tolerant architectures, and leveraging cloud infrastructure that provides auto-scaling capabilities.

Tuesday, March 7, 2023

Explain the types of tables in #Hive

In Apache Hive, there are two types of tables: managed tables and external tables.

Managed tables, also known as internal tables, are tables where Hive manages both the metadata and the data itself. When you create a managed table in Hive, it creates a directory in the default Hive warehouse location and stores the data in that directory. If you drop the table, Hive will delete the table metadata as well as the data directory. Managed tables are typically used for long-term data storage and are ideal for scenarios where you want Hive to control the data completely.

External tables, on the other hand, are tables where Hive only manages the metadata and the data is stored outside of the Hive warehouse directory. When you create an external table in Hive, you specify the location of the data directory where the data is stored. If you drop the external table, Hive only deletes the metadata and leaves the data directory intact. External tables are useful when you need to share data across multiple systems, or when the data is stored outside of the Hive warehouse directory.

In summary, the main difference between managed and external tables in Hive is where the data is stored and who controls it. With managed tables, Hive controls both the metadata and the data, while with external tables, Hive only controls the metadata, and the data is stored outside of the Hive warehouse directory.

What is the difference between client mode and cluster mode?

In the context of Apache Spark, client mode and cluster mode refer to different ways of running Spark applications.

In client mode, the driver program runs on the same machine that the Spark application is launched from. The driver program communicates with the cluster manager to request resources and schedule tasks on the worker nodes. The client mode is typically used for interactive workloads, where the user wants to have direct access to the results of the Spark application.

In cluster mode, the driver program runs on one of the worker nodes in the cluster, rather than on the client machine. The client machine submits the application to the cluster manager, which then launches the driver program on one of the worker nodes. The driver program then communicates with the cluster manager to request resources and schedule tasks on the worker nodes. The cluster mode is typically used for batch workloads, where the Spark application is run as a part of a larger data processing pipeline.

The key difference between client mode and cluster mode is where the driver program is run. In client mode, the driver program runs on the client machine, which provides direct access to the application results. In cluster mode, the driver program runs on one of the worker nodes, which keeps it close to the executors and lets the client machine disconnect after submitting the application, so it suits long-running and larger data processing workloads.

20 most asked #interview #questions in #spark with #answers

What is Spark?
Spark is an open-source distributed computing system used for processing large-scale data sets. It provides high-level APIs for programming in Java, Scala, Python, and R.
What are the key features of Spark?
The key features of Spark include in-memory processing, support for a wide range of data sources, and built-in support for machine learning, graph processing, and streaming data processing.
What is an RDD in Spark?
RDD (Resilient Distributed Datasets) is the fundamental data structure in Spark. It is an immutable distributed collection of objects, which can be processed in parallel across multiple nodes.
What are the different transformations in Spark?
The different transformations in Spark include map, filter, flatMap, distinct, groupByKey, reduceByKey, sortByKey, join, and union.
What are the different actions in Spark?
The different actions in Spark include collect, count, first, take, reduce, saveAsTextFile, foreach, and foreachPartition.
What is lazy evaluation in Spark?
Lazy evaluation is a feature in Spark where the transformations are not executed until an action is called. This reduces unnecessary computations and improves performance.
What is the difference between map and flatMap in Spark?
Map applies a function to each element in an RDD and returns a new RDD with exactly one output element per input element, while flatMap applies a function that returns a sequence for each element and flattens the results into a single RDD.
What is the difference between transformation and action in Spark?
A transformation is a function that produces a new RDD from an existing one, while an action is a function that returns a result or saves data to a storage system.
What is Spark Streaming?
Spark Streaming is a component of Spark that allows processing of real-time data streams using Spark's batch processing engine.
What is Spark SQL?
Spark SQL is a module in Spark that allows processing of structured and semi-structured data using SQL-like queries.
What is Spark MLlib?
Spark MLlib is a machine learning library in Spark that provides scalable implementations of various machine learning algorithms.
What is a broadcast variable in Spark?
A broadcast variable is a read-only variable that can be cached on each machine in a cluster for more efficient data sharing.
What is SparkContext in Spark?
SparkContext is the entry point for Spark applications and represents the connection to a Spark cluster.
What is the role of the Driver program in Spark?
The Driver program is the main program that creates the SparkContext, defines the transformations and actions to be performed on the data, and schedules the resulting tasks on the executors.
What is a cluster manager in Spark?
A cluster manager is responsible for managing the resources and scheduling the tasks across the nodes in a Spark cluster.
What is a Shuffle in Spark?
A Shuffle is the process of redistributing data across the nodes in a cluster to prepare it for a subsequent operation, such as a reduce operation.
What is a Partition in Spark?
A Partition is a logical unit of data in an RDD that can be processed in parallel across different nodes.
What is a DAG in Spark?
A DAG (Directed Acyclic Graph) is a data structure in Spark that represents the sequence of transformations and actions to be executed on an RDD.
What is a Spark Executor?
A Spark Executor is a process launched on a worker node that executes tasks and holds cached data on behalf of the Driver program.
What is a Spark Worker?
A Spark Worker is a node in a Spark cluster that runs Executors and manages the resources allocated to them.
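
The difference between map and flatMap described in the list above can be illustrated with plain Python lists. This is an analogy rather than Spark code; `map_like` and `flat_map_like` are hypothetical helpers that mimic the two behaviors.

```python
from itertools import chain


def map_like(func, items):
    """One output element per input element, as with map."""
    return [func(item) for item in items]


def flat_map_like(func, items):
    """func returns a sequence per element; the results are flattened, as with flatMap."""
    return list(chain.from_iterable(func(item) for item in items))


if __name__ == "__main__":
    lines = ["hello world", "spark"]
    print(map_like(str.split, lines))       # [['hello', 'world'], ['spark']]
    print(flat_map_like(str.split, lines))  # ['hello', 'world', 'spark']
```

With map the result keeps one list of words per line, while with flatMap the words from all lines end up in a single flat collection, which is why flatMap is the usual choice for splitting text into words.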

10 commonly asked #interview #questions in #Apache #Spark

Here are 10 commonly asked interview questions in Spark:

What is Spark? Explain its architecture and components.
What is the difference between MapReduce and Spark? When would you use one over the other?
What is RDD in Spark? Explain its properties and transformations.
What is lazy evaluation in #Spark? How does it impact performance?
What is a data frame in #Spark? How is it different from an RDD?
Explain the concept of partitioning in Spark.
What is Spark SQL? How is it used?
What is a Spark cluster? How does it differ from a Hadoop cluster?
What is Spark Streaming? How does it work?
What are the benefits of using Spark over other data processing frameworks?

Wednesday, May 3, 2017

Hadoop general interview questions

architecture components of Hadoop
OS-level optimisation
prerequisites before installing
how to bring data
what we need to make sure of in order to copy data from one cluster to another
scenarios and use of the scheduler
I want to implement department-wise access levels on HDFS and YARN
job flow in YARN
how resource allocation happens in YARN
what is the file read/write flow
how different nodes in a cluster communicate with each other
how the request flows through ZooKeeper
please tell something about the read/write pipeline in Hadoop
how do you do deployments on many servers at once

Infinite Interview Questions
