The goal of this project is to predict chronic kidney disease from parameters such as Specific Gravity, Hypertension, Hemoglobin, Diabetes Mellitus, Albumin, Appetite, Red Blood Cell Count, Pus Cell, etc.
Libraries used:-
- NumPy
- Matplotlib
- Seaborn
- scikit-learn
- pandas
This is an open-source data set taken from Kaggle.
Attributes in the given data set:-
age - age
bp - blood pressure
sg - specific gravity
al - albumin
su - sugar
rbc - red blood cells
pc - pus cell
pcc - pus cell clumps
ba - bacteria
bgr - blood glucose random
bu - blood urea
sc - serum creatinine
sod - sodium
pot - potassium
hemo - hemoglobin
pcv - packed cell volume
wc - white blood cell count
rc - red blood cell count
htn - hypertension
dm - diabetes mellitus
cad - coronary artery disease
appet - appetite
pe - pedal edema
ane - anemia
class - class
Attribute Information:
The data set has 24 features plus the class label, 25 attributes in total (11 numerical, 14 nominal).
Age(numerical) age in years
Blood Pressure(numerical) bp in mmHg
Specific Gravity(nominal) sg - (1.005,1.010,1.015,1.020,1.025)
Albumin(nominal) al - (0,1,2,3,4,5)
Sugar(nominal) su - (0,1,2,3,4,5)
Red Blood Cells(nominal) rbc - (normal,abnormal)
Pus Cell (nominal) pc - (normal,abnormal)
Pus Cell clumps(nominal) pcc - (present,notpresent)
Bacteria(nominal) ba - (present,notpresent)
Blood Glucose Random(numerical) bgr in mg/dL
Blood Urea(numerical) bu in mg/dL
Serum Creatinine(numerical) sc in mg/dL
Sodium(numerical) sod in mEq/L
Potassium(numerical) pot in mEq/L
Hemoglobin(numerical) hemo in gms
Packed Cell Volume(numerical) pcv
White Blood Cell Count(numerical) wc in cells/cumm
Red Blood Cell Count(numerical) rc in millions/cmm
Hypertension(nominal) htn - (yes,no)
Diabetes Mellitus(nominal) dm - (yes,no)
Coronary Artery Disease(nominal) cad - (yes,no)
Appetite(nominal) appet - (good,poor)
Pedal Edema(nominal) pe - (yes,no)
Anemia(nominal) ane - (yes,no)
Class (nominal) class - (ckd,notckd)
Steps followed during data cleaning:-
- Convert categorical values to numerical values.
- Correct the misspelled categorical values.
- Drop the rows that contain null values.
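The cleaning steps above can be sketched with pandas as follows. This is only an illustration under stated assumptions: the sample rows are made up, the "misspellings" are assumed to be stray tabs and spaces around labels, and the `clean` helper, the `CATEGORICAL` list, and the 0/1 encoding in `MAPPING` are hypothetical choices, not necessarily the ones used in the project.

```python
import io

import pandas as pd

# Made-up sample rows in the style of the data set (illustrative values only).
RAW_CSV = (
    "age,bp,rbc,dm,appet,class\n"
    "48,80,normal,yes,good,ckd\n"
    "62,70,abnormal,\tno,poor,ckd\t\n"
    "35,80,normal,no,good,notckd\n"
    "51,,normal, yes,good,ckd\n"
)

# Hypothetical subset of the nominal columns and an assumed 0/1 encoding.
CATEGORICAL = ["rbc", "dm", "appet", "class"]
MAPPING = {
    "normal": 0, "abnormal": 1,
    "no": 0, "yes": 1,
    "good": 0, "poor": 1,
    "notckd": 0, "ckd": 1,
}


def clean(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    for col in CATEGORICAL:
        # Correct malformed labels (assumed here to be stray whitespace).
        df[col] = df[col].str.strip()
        # Convert categories to numbers; labels missing from MAPPING become NaN.
        df[col] = df[col].map(MAPPING)
    # Drop every row that still contains a null value.
    df = df.dropna().reset_index(drop=True)
    df[CATEGORICAL] = df[CATEGORICAL].astype(int)
    return df


raw = pd.read_csv(io.StringIO(RAW_CSV))
cleaned = clean(raw)
print(cleaned)
```

Note that this sketch fixes the labels before encoding them, because a label such as `"\tno"` would otherwise not match the mapping and the whole row would be dropped as null.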
Steps followed in EDA:-
- Find the correlations between the attributes in the data set.
Models used:-
- Random Forest
- AdaBoost (AdaBoostClassifier)
- Gradient Boosting
- Logistic Regression
- Naive Bayes
- KNN
- Kernel SVM
- Decision Tree
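As a rough sketch of the correlation step and the model list above, the snippet below computes each feature's correlation with the class and then compares the eight classifiers with 5-fold cross-validation in scikit-learn. The data is synthetic (`make_classification`) rather than the kidney data set, the feature names `f0`..`f7` are placeholders, and all model settings are assumptions or library defaults, not the project's actual configuration.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cleaned, fully numeric data set.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)
columns = [f"f{i}" for i in range(X.shape[1])]  # placeholder feature names
df = pd.DataFrame(X, columns=columns)
df["class"] = y

# Correlation of every feature with the class label, strongest first.
corr_with_class = (
    df.corr()["class"]
    .drop("class")
    .sort_values(key=lambda s: s.abs(), ascending=False)
)
print(corr_with_class)

# One instance per model in the list above (settings are assumptions).
models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "Kernel SVM": SVC(kernel="rbf"),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model.
scores = {
    name: cross_val_score(model, df[columns], df["class"], cv=5).mean()
    for name, model in models.items()
}
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {score:.3f}")
```

On the real data, distance-based and margin-based models (KNN, SVM, Logistic Regression) would likely benefit from feature scaling, for example a `StandardScaler` in a `Pipeline`.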
Perform hyperparameter tuning to reduce overfitting in the model.
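One common way to do this is a cross-validated grid search. The sketch below tunes a Random Forest with scikit-learn's `GridSearchCV` on synthetic data; the choice of model, the parameter grid, and the split sizes are assumptions made for illustration, not the settings used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the cleaned data set.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Shallower trees and larger leaves constrain the model, which tends to
# reduce overfitting. The grid values are illustrative assumptions.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, 5, None],
    "min_samples_leaf": [1, 5],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# A large gap between these two scores is a sign of overfitting.
train_acc = search.score(X_train, y_train)
test_acc = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")
```

The held-out test split is never seen during the search, so `test_acc` gives a fairer estimate of how the tuned model generalizes.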
Create a website using Flask and use the model for prediction.
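A minimal sketch of how a Flask app might serve a trained model is shown below. It is not the project's actual website: the `/predict` route, the JSON payload shape, the four-feature input, and the Logistic Regression model trained on synthetic data are all assumptions chosen to keep the example self-contained.

```python
from flask import Flask, jsonify, request
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

N_FEATURES = 4  # assumed input size for this sketch

# Stand-in model trained on synthetic data; a real app would load the
# trained kidney disease model instead.
X, y = make_classification(
    n_samples=200,
    n_features=N_FEATURES,
    n_informative=3,
    n_redundant=1,
    random_state=0,
)
model = LogisticRegression(max_iter=1000).fit(X, y)

app = Flask(__name__)


@app.route("/predict", methods=["POST"])  # hypothetical endpoint name
def predict():
    payload = request.get_json(silent=True) or {}
    features = payload.get("features")
    if not isinstance(features, list) or len(features) != N_FEATURES:
        message = f"expected 'features' as a list of {N_FEATURES} numbers"
        return jsonify(error=message), 400
    try:
        row = [float(value) for value in features]
    except (TypeError, ValueError):
        return jsonify(error="features must be numeric"), 400
    prediction = int(model.predict([row])[0])
    return jsonify(prediction=prediction)


# Exercise the endpoint in-process with Flask's test client.
with app.test_client() as client:
    response = client.post("/predict", json={"features": [0.1, -1.2, 0.5, 2.0]})
    print(response.status_code, response.get_json())
```

To serve it over HTTP, the file could be started with Flask's command-line runner, for example `flask --app app run` if it is saved as `app.py`.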

