Following a method I found online, I put the following configuration in log4j.properties:
# Set everything to be logged to the console
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
# Settings to quiet third party logs that are too verbose
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
With this in place, spark-shell prints only WARN messages and the output is nice and clean.
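The same mechanism extends to any other chatty package: each `log4j.logger.<package>=LEVEL` line overrides the level for that logger subtree. As a sketch (the package name is just an example of the pattern, not part of the original config), Spark's own classes can be quieted the same way:

```properties
# Example only: raise the level on Spark's own packages as well,
# so their INFO messages are filtered out like the Jetty ones above
log4j.logger.org.apache.spark=WARN
```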
lin@lin-spark:/opt/data01/spark-1.3.0-bin-2.6.0-cdh5.4.0$ bin/spark-shell
Spark assembly has been built with Hive, including Datanucleus jars on classpath
16/05/21 10:56:52 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.3.0
      /_/
Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_05)
Type in expressions to have them evaluated.
Type :help for more information.
16/05/21 10:56:56 WARN Utils: Your hostname, lin-spark resolves to a loopback address: 127.0.1.1; using 10.170.56.63 instead (on interface eth0)
16/05/21 10:56:56 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Spark context available as sc.
SQL context available as sqlContext.
But when I run my code from IDEA, the output is still full of INFO messages. Why is that, and how can I fix it?
16/05/21 10:57:52 INFO MemoryStore: Block broadcast_52_piece0 stored as bytes in memory (estimated size 2.0 KB, free 253.4 MB)
16/05/21 10:57:52 INFO BlockManagerInfo: Added broadcast_52_piece0 in memory on localhost:56191 (size: 2.0 KB, free: 256.8 MB)
16/05/21 10:57:52 INFO BlockManagerMaster: Updated info of block broadcast_52_piece0
16/05/21 10:57:52 INFO SparkContext: Created broadcast 52 from broadcast at DAGScheduler.scala:839
16/05/21 10:57:52 INFO DAGScheduler: Submitting 1 missing tasks from Stage 39 (MapPartitionsRDD[98] at map at homework3.scala:67)
16/05/21 10:57:52 INFO TaskSchedulerImpl: Adding task set 39.0 with 1 tasks
16/05/21 10:57:52 INFO TaskSetManager: Starting task 0.0 in stage 39.0 (TID 654, localhost, PROCESS_LOCAL, 1322 bytes)
16/05/21 10:57:52 INFO Executor: Running task 0.0 in stage 39.0 (TID 654)
16/05/21 10:57:52 INFO HadoopRDD: Input split: file:/opt/data02/sparkApp/IndexSearch/IRdata/reut2-007_491:0+4503
16/05/21 10:57:52 INFO Executor: Finished task 0.0 in stage 39.0 (TID 654). 1845 bytes result sent to driver
16/05/21 10:57:52 INFO TaskSetManager: Finished task 0.0 in stage 39.0 (TID 654) in 54 ms on localhost (1/1)
16/05/21 10:57:52 INFO TaskSchedulerImpl: Removed TaskSet 39.0, whose tasks have all completed, from pool
16/05/21 10:57:52 INFO DAGScheduler: Stage 39 (first at homework3.scala:68) finished in 0.054 s
16/05/21 10:57:52 INFO DAGScheduler: Job 29 finished: first at homework3.scala:68, took 0.056794 s
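A likely cause (my assumption, not stated in the original post): when the job is launched from IDEA, Spark's conf/ directory is not on the application classpath, so the edited log4j.properties is never picked up and log4j falls back to Spark's bundled defaults. Two common workarounds are to copy log4j.properties into the project's src/main/resources directory (which IDEA does put on the classpath), or to raise the log levels programmatically before the SparkContext is created. A minimal Scala sketch of the latter; the object name and app name here are hypothetical stand-ins for the original homework3:

```scala
import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}

object LogLevelDemo {  // hypothetical name, stands in for homework3
  def main(args: Array[String]): Unit = {
    // Raise the levels before constructing the SparkContext,
    // so the startup INFO messages are suppressed as well
    Logger.getLogger("org").setLevel(Level.WARN)
    Logger.getLogger("akka").setLevel(Level.WARN)

    val conf = new SparkConf().setMaster("local").setAppName("log-level-demo")
    val sc = new SparkContext(conf)
    // ... run the job as before ...
    sc.stop()
  }
}
```

Note that Logger.getLogger("org") covers both org.apache.spark and org.eclipse.jetty, which matches the packages quieted in the properties file above.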