Logistic Regression With Spark ML (DataFrames)
I wrote the following code for logistic regression and want to use the Pipeline API provided by spark.ml. However, it gives me an error when I try to print the coefficients and intercept.
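The error most likely comes from reading coefficients and intercept off the PipelineModel itself, which does not expose them; only the fitted LogisticRegressionModel stage inside it does. A minimal sketch of that failure mode, assuming a fitted pipeline named model as in the answers below:
# Hypothetical reproduction: PipelineModel has no `coefficients` attribute,
# so this raises AttributeError; access the LogisticRegression stage instead.
print(model.coefficients)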
Solution 1:
You can access the individual stages through the stages attribute of the PipelineModel:
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression, LogisticRegressionModel
from pyspark.ml.feature import VectorAssembler

# Toy training data: label followed by three feature columns
df = sc.parallelize([
    (0.0, 1.0, 2.0, 4.0),
    (1.0, 3.0, 4.0, 5.0)
]).toDF(["label", "x1", "x2", "x3"])

assembler = (VectorAssembler()
    .setInputCols(df.columns[1:])
    .setOutputCol("features"))

lr = LogisticRegression(maxIter=10, regParam=0.01)

pipeline = Pipeline(stages=[assembler, lr])
model = pipeline.fit(df)

[stage.coefficients for stage in model.stages if hasattr(stage, "coefficients")]
## [DenseVector([2.1178, 1.6843, -1.8338])]

## or

[stage.coefficients for stage in model.stages
 if isinstance(stage, LogisticRegressionModel)]
## [DenseVector([2.1178, 1.6843, -1.8338])]
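The intercept the question also asks about can be pulled out the same way; a small sketch building on the model fitted above:
# Intercept of the fitted LogisticRegressionModel stage (a list with one float)
[stage.intercept for stage in model.stages
 if isinstance(stage, LogisticRegressionModel)]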
Solution 2:
Try this
pipeline = Pipeline(stages=[assembler, lr])
model = pipeline.fit(trainingData)

# The LogisticRegressionModel is the last stage of the fitted pipeline
lrm = model.stages[-1]
lrm.coefficients
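Printing both of the values the question asked for then works directly on that stage; a small usage sketch, assuming lrm is the fitted LogisticRegressionModel extracted above:
print(lrm.coefficients)  # DenseVector of feature weights
print(lrm.intercept)     # model intercept (float)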