Reading data files with PySpark usually takes a single command.
However, you need to add options depending on the format and content of the file.
Simple file read command:
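The command itself is not shown here, so the following is a minimal sketch rather than the original snippet. It assumes the file is a CSV, that a SparkSession is already available, and it uses a hypothetical helper name (`read_csv`) and a hypothetical path (`data/sample.csv`).

```python
def read_csv(spark, path):
    """Read a CSV file into a DataFrame using Spark's default settings.

    `spark` is an existing SparkSession; `path` is the file location.
    With the defaults, every column is read as a string and columns
    are named _c0, _c1, ...
    """
    return spark.read.csv(path)


# Usage (assumes pyspark is installed and the hypothetical path exists):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.appName("read-example").getOrCreate()
#   df = read_csv(spark, "data/sample.csv")
#   df.show(5)
```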
Reading a CSV file whose first line is a header:
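As a hedged illustration of the header case, the sketch below sets the `header` option so that Spark takes column names from the first line of the file. The helper name `read_csv_with_header` and the path in the usage comment are assumptions for illustration only.

```python
def read_csv_with_header(spark, path):
    """Read a CSV file, treating its first line as the column names.

    `spark` is an existing SparkSession; `path` is the file location.
    """
    return spark.read.option("header", "true").csv(path)


# Usage (assumes pyspark is installed and the hypothetical path exists):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.appName("read-example").getOrCreate()
#   df = read_csv_with_header(spark, "data/sample.csv")
#   df.printSchema()
```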
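Other options you can add include:
escape : "\"" (character used to escape quotes inside a quoted value)
multiLine : "true" (allow a single record to span several lines)
delimiter : "|" (field separator, here a pipe instead of the default comma)
nullValue : "\\N" (string in the file that should be read as null)
inferSchema : "true" (infer column types from the data, at the cost of an extra pass over the file)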
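One possible way to combine these options is to keep them in a dictionary and pass them to the reader in one call. This is a sketch under assumptions: the names `CSV_OPTIONS` and `read_csv_with_options` are hypothetical, the `header` entry is carried over from the header example rather than from the list, and the values are the ones listed above, which may not suit every file.

```python
# Option values copied from the list above; adjust them to match your file.
CSV_OPTIONS = {
    "header": "true",       # first line holds the column names
    "escape": "\"",         # a double quote escapes quotes inside quoted values
    "multiLine": "true",    # a record may span several lines
    "delimiter": "|",       # fields are separated by a pipe
    "nullValue": "\\N",     # the two characters \N are read as null
    "inferSchema": "true",  # infer column types from the data
}


def read_csv_with_options(spark, path, options=CSV_OPTIONS):
    """Read a CSV file with the given reader options.

    `spark` is an existing SparkSession; `path` is the file location.
    """
    return spark.read.options(**options).csv(path)


# Usage (assumes pyspark is installed and the hypothetical path exists):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.appName("read-example").getOrCreate()
#   df = read_csv_with_options(spark, "data/sample.psv")
#   df.show(5)
```

Keeping the options in one dictionary makes it easy to reuse the same settings across several files or to override a single entry for one read.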