Decision Path For A Random Forest Classifier
Here is my code to run in your environment. I am using the RandomForestClassifier and I am trying to figure out the decision_path for a selected sample in the RandomForestClassi
Solution 1:
The RandomForestClassifier.decision_path method returns a tuple of (indicator, n_nodes_ptr). See the scikit-learn documentation for this method.
So your variable node_indicator is a tuple, not the sparse matrix you expect. A tuple object has no attribute 'indices', which is why you get the error when you run:
node_index = node_indicator.indices[node_indicator.indptr[sample_id]:
                                    node_indicator.indptr[sample_id + 1]]
Try unpacking the tuple instead:
(node_indicator, _) = rf.decision_path(X_train)
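To see the two return values side by side, here is a minimal sketch. It assumes synthetic data from make_classification and a small forest of 5 trees, since the question's X_train and y_train are not shown; the name result is only illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (assumption: the original X_train / y_train are not shown).
X_train, y_train = make_classification(n_samples=200, n_features=5, random_state=0)
rf = RandomForestClassifier(n_estimators=5, random_state=0).fit(X_train, y_train)

# The forest-level method returns a tuple, so unpack it.
result = rf.decision_path(X_train)
node_indicator, n_nodes_ptr = result

print(type(result).__name__)   # the container type returned by the forest
print(node_indicator.shape)    # (n_samples, total number of nodes over all trees)
print(n_nodes_ptr)             # column offsets: one entry per tree, plus a final total

# With the tuple unpacked, the CSR attributes are available again.
sample_id = 0
node_index = node_indicator.indices[
    node_indicator.indptr[sample_id]:node_indicator.indptr[sample_id + 1]
]
print(node_index)              # forest-wide column ids visited by this sample
```

Note that these column ids run across all trees of the forest: the columns for tree j lie between n_nodes_ptr[j] and n_nodes_ptr[j + 1].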
You can also print the decision path through each tree of the forest for a single sample id:
X_train = X_train.values  # assumes X_train is a pandas DataFrame
sample_id = 0

for j, tree in enumerate(rf.estimators_):
    n_nodes = tree.tree_.node_count
    children_left = tree.tree_.children_left
    children_right = tree.tree_.children_right
    feature = tree.tree_.feature
    threshold = tree.tree_.threshold

    print("Decision path for DecisionTree {0}".format(j))
    # On a single tree, decision_path returns the sparse matrix directly.
    node_indicator = tree.decision_path(X_train)
    leave_id = tree.apply(X_train)
    node_index = node_indicator.indices[node_indicator.indptr[sample_id]:
                                        node_indicator.indptr[sample_id + 1]]

    print('Rules used to predict sample %s: ' % sample_id)
    for node_id in node_index:
        # Skip the leaf node: it has no split rule to report.
        if leave_id[sample_id] == node_id:
            continue

        if (X_train[sample_id, feature[node_id]] <= threshold[node_id]):
            threshold_sign = "<="
        else:
            threshold_sign = ">"

        print("decision id node %s : (X_train[%s, %s] (= %s) %s %s)"
              % (node_id,
                 sample_id,
                 feature[node_id],
                 X_train[sample_id, feature[node_id]],
                 threshold_sign,
                 threshold[node_id]))
Note that in your case you have 50 estimators, so the output may be tedious to read.
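If the full rule listing for 50 trees is too much, one option is to collect the visited nodes per tree from a single forest-level call and print only a short summary. The following is a rough sketch under stated assumptions: synthetic data from make_classification stands in for the real training set, and the names visited, paths and lengths are hypothetical helpers, not part of scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (assumption: the original training set is not shown).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

sample_id = 0
# Pass a single row so every stored index belongs to the chosen sample.
indicator, n_nodes_ptr = rf.decision_path(X[sample_id:sample_id + 1])
visited = indicator.indices

paths = {}
for j in range(len(rf.estimators_)):
    start, stop = n_nodes_ptr[j], n_nodes_ptr[j + 1]
    # Columns in [start, stop) belong to tree j; subtracting start
    # converts them to that tree's own node ids.
    in_tree = (visited >= start) & (visited < stop)
    paths[j] = np.sort(visited[in_tree] - start)

# Compact summary: number of nodes on the path in each tree.
lengths = [len(p) for p in paths.values()]
print("shortest path:", min(lengths), "nodes; longest path:", max(lengths), "nodes")

# Inspect only the first few trees instead of all 50.
for j in range(3):
    print("tree", j, "visits nodes", paths[j].tolist())
```

The node ids in paths[j] can then be used with rf.estimators_[j].tree_.feature and rf.estimators_[j].tree_.threshold to print the rules for just the trees of interest.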