Configuring your cluster and tuning that configuration (for example, deciding how to split the data so that it can be sent to the various machines) is a large topic in its own right.
Those log files can be huge, but the work will be split up among the machines (nodes) in your Hadoop cluster.
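To make the idea of splitting work among nodes concrete, here is a minimal local sketch in Python. It does not use Hadoop or any Hadoop API: worker threads stand in for nodes, and the function names (`split_into_chunks`, `count_levels`, `count_levels_distributed`) and the log format (each line starts with a level such as ERROR or INFO) are assumptions made for illustration only.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(lines, num_chunks):
    """Split a list of lines into at most num_chunks roughly equal slices."""
    size = max(1, -(-len(lines) // num_chunks))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def count_levels(chunk):
    """Work done by one 'node': count the log level of each line in its chunk."""
    # Assumption: the first whitespace-separated token is the log level.
    return Counter(line.split()[0] for line in chunk if line.strip())


def count_levels_distributed(lines, num_nodes=3):
    """Split the log, process each chunk in its own worker, merge the results."""
    chunks = split_into_chunks(lines, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partial_counts = list(pool.map(count_levels, chunks))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)  # adds the per-chunk counts together
    return total


# Hypothetical sample data standing in for a much larger log file.
sample_log = [
    "ERROR disk full",
    "INFO job started",
    "ERROR timeout",
    "WARN slow response",
    "INFO job finished",
]

print(count_levels_distributed(sample_log))
```

Each worker sees only its own slice of the log, and the partial counts are combined at the end. A real cluster follows the same pattern, with the added work of moving data between machines and recovering from node failures.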