Progress Seminar 2019.2.12 신희안
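Research Progress Report
Topics: Artificial heart-lung machine | Suture Force | HF Classification | Other
Plan from two weeks ago / Research results: Software validation (Dong-A); battery performance test; KIMES device delivery; nylon specimen fabrication; delivery completed; multiple algorithms applied
Problems and countermeasures: one flowmeter failed
Goals and plans: data processing; refer to papers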
Data Preprocessing
Data table: Patients (rows) by Parameters (columns)
X : input
Y : death related parameter
Drop column: Missing > Threshold
Drop row: with missing value
① MATLAB preprocessing
② Pandas (Python lib.) preprocessing
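The two drop rules on this slide can be sketched in pandas as follows. This is a minimal illustration, not the preprocessing script actually used: the function name `preprocess`, the column names, the toy values, and the 0.5 threshold are all hypothetical, and the slide does not state the real threshold.

```python
import numpy as np
import pandas as pd


def preprocess(df: pd.DataFrame, target: str, max_missing_frac: float = 0.3):
    """Drop sparse columns, then drop rows that still contain missing values."""
    # Fraction of missing entries per column (parameter).
    missing_frac = df.isna().mean()
    # Keep columns at or below the threshold; always keep the target column.
    keep = [c for c in df.columns
            if c == target or missing_frac[c] <= max_missing_frac]
    # Drop every row (patient) that still has at least one missing value.
    cleaned = df[keep].dropna(axis=0, how="any")
    X = cleaned.drop(columns=[target])
    y = cleaned[target]
    return X, y


# Hypothetical toy table: rows are patients, columns are parameters.
toy = pd.DataFrame({
    "age": [61, 72, np.nan, 55],
    "ef": [35.0, np.nan, np.nan, np.nan],  # 75% missing, so the column is dropped
    "sodium": [138, 141, 135, 140],
    "death": [0, 1, 0, 1],
})
X, y = preprocess(toy, target="death", max_missing_frac=0.5)
```

Dropping columns before rows keeps more patients: a single sparse parameter would otherwise remove almost every row.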
Random Forest
Output = follow-up duration
Hyperparameters:
n_estimators: number of trees in the forest
max_depth: maximum number of levels in each decision tree
max_features: number of features considered when splitting a node
Data split:
80%: Train dataset
20%: Test dataset
Each epoch:
Randomly pick hyperparameter values for the model
Fit the train dataset to the model
5-fold cross-validation of the model with the test dataset
Print out the score for each selected set of hyperparameters
Notes:
Can handle both categorical and numerical features
Normalization not needed
Hyperparameter tuning is important
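The random hyperparameter search described on this slide could look roughly like the sketch below. It assumes scikit-learn, which the slide does not name, and uses synthetic stand-in data with hypothetical candidate values. One deliberate difference: `RandomizedSearchCV` cross-validates on the training split, which is the usual convention, whereas the slide mentions the test dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in data (assumption): 200 samples, 10 numeric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

# 80% train / 20% test split, as on the slide.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hypothetical candidate values for the three hyperparameters on the slide.
param_distributions = {
    "n_estimators": [20, 50, 100],
    "max_depth": [3, 5, 10, None],
    "max_features": ["sqrt", 0.5, 1.0],
}

# Each of the n_iter iterations draws one random combination and scores it
# with 5-fold cross-validation on the training split.
search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,
    cv=5,
    scoring="r2",
    random_state=0,
)
search.fit(X_train, y_train)

# Print the mean cross-validated R^2 for every sampled combination.
for params, score in zip(search.cv_results_["params"],
                         search.cv_results_["mean_test_score"]):
    print(params, round(float(score), 3))

# Compare train and held-out R^2 of the best model to check for overfitting.
train_r2 = search.best_estimator_.score(X_train, y_train)
test_r2 = search.best_estimator_.score(X_test, y_test)
```

Keeping the 20% test split out of the search means `test_r2` remains an unbiased estimate after tuning.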
Random Forest
<Regression1>
X : input (206 params.)
Y : Follow up duration
R^2 score: Train data score = 0.926, Test data score = 0.481
<Regression2>
X : input (29 params.), selected with Pearson correlation > 0.1 against the follow up duration row
Y : Follow up duration
R^2 score: Train data score = 0.752, Test data score = 0.520
Even when the hyperparameters are changed, the 5-fold validation score barely changes.
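The parameter selection used for Regression2 can be illustrated with a small standard-library sketch. The slide does not say whether the 0.1 threshold applies to the signed or the absolute correlation, so this sketch assumes the absolute value. The helper names `pearson` and `select_features` and the toy columns are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: correlation undefined, treat as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, threshold=0.1):
    """Return names of columns whose |r| with the target exceeds the threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) > threshold]


# Hypothetical toy data: follow-up duration and four candidate parameters.
follow_up = [1, 2, 3, 4, 5, 6]
columns = {
    "a": [2, 4, 6, 8, 10, 12],   # perfectly positively correlated
    "b": [6, 5, 4, 3, 2, 1],     # perfectly negatively correlated
    "c": [1, 0, 0, 0, 0, 1],     # uncorrelated with the target
    "d": [3, 3, 3, 3, 3, 3],     # constant column
}
selected = select_features(columns, follow_up, threshold=0.1)
```

If the selection is computed on all rows before the train/test split, information from the test rows leaks into it. Computing the correlations on the training rows only avoids that.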
Random Forest: Regression vs. Classification
Inputs: All params. / Hi-corr. params.
Regression target: Follow up duration
Classification target: 9 Categories (6 months * n)
<Classification>
Inputs: All params. / Hi-corr. params.
Random Forest ~30%, XGBoost, ANN ~65%
Score = Accuracy….?
XGBoost: builds trees one at a time
Similar accuracy….
Input data needs to be transformed
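The conversion from a continuous follow-up duration to the 9-category target, and the accuracy score, could be sketched as below. This assumes durations are measured in months, bins start at 0, and anything beyond the eighth bin edge falls into the last category. The slide states none of these, and the function names and sample values are hypothetical.

```python
def duration_to_category(months, bin_width=6, n_categories=9):
    """Map a follow-up duration in months to a class index 0..n_categories-1."""
    if months < 0:
        raise ValueError("duration must be non-negative")
    # Durations beyond the last bin edge fall into the final category (assumption).
    return min(int(months // bin_width), n_categories - 1)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class exactly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical durations in months and their class labels.
durations = [2, 6, 11.5, 30, 47, 60]
labels = [duration_to_category(m) for m in durations]

# Hypothetical predictions with one wrong class out of six.
predictions = [0, 1, 2, 5, 7, 8]
score = accuracy(labels, predictions)
```

Plain accuracy treats a prediction one bin away the same as one eight bins away. For an ordered target like this, a distance-aware metric such as mean absolute error over the class indices may be a more informative score.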