How to read a CSV file using PySpark?


Reading data files with PySpark is straightforward and usually takes a single command.

However, you need to add options depending on the format and content of the file.


Simple file read command:

Reading a CSV file with a header row:

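Other options you can add include:

escape : "\"" (character used to escape quotes inside a quoted value; the default is a backslash)
multiLine : "true" (allows a quoted value to span several lines)
delimiter : "|" (column separator; the default is a comma)
nullValue : "\\N" (string that should be read as null)
inferSchema : "true" (scans the data to detect column types instead of reading every column as a string)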
