Starting in a Hurry

This morning I tossed and turned in bed long after the alarm went off. Feeling rushed, I hurried to get ready and left the house fifteen minutes after getting up.

How long it takes men and women to get ready

I cut through the cold morning air and made it to where the car was parked in one go.

But no.  The key wasn't there.  OTL

I often rush into things without preparing properly and end up in trouble, or, even when it doesn't go that far, my anxiousness keeps me from working efficiently.

Doing things right matters more than doing them fast.  And to do them right, it is important to plan and review carefully before starting.

For which of you, intending to build a tower, does not first sit down and count the cost, whether he has enough to complete it? – Luke 14:28

Don't rush.  Don't be frantic.

 

Kaggler’s Toolbox – Setup (from Kaggler.com)

This article is originally posted on Kaggler.com.


I’d like to open up the toolbox that I’ve built for data mining competitions and share it with you.

Let me start with my setup.

System

I have access to 2 machines:

  • Laptop – Macbook Pro Retina 15″, OS X Yosemite, i7 2.3GHz 4 Core CPU, 16GB RAM, GeForce GT 750M 2GB, 500GB SSD
  • Desktop – Ubuntu 14.04, i7 5820K 3.3GHz 6 Core CPU, 64GB RAM, GeForce GT 620 1GB, 120GB SSD + 3TB HDD

I purchased the desktop on eBay for around $2,000 a year ago (September 2014).

Git

As the code repository and version control system, I use git.

It’s useful for collaboration with other team members.  It makes it easy to share the code base, keep track of changes, and resolve conflicts when two people change the same code.

It’s useful even when I work by myself.  It helps me reuse and improve code from competitions I participated in before.

For competitions, I use gitlab instead of github because it offers an unlimited number of private repositories.
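
For example, hooking a new competition folder up to a private gitlab repository takes only a few commands.  This is a minimal sketch; the repository name and URL below are placeholders, not an actual project of mine:

$ cd some-competition
$ git init
$ git add .
$ git commit -m "initial commit"
# add a private gitlab remote (placeholder URL) and push
$ git remote add origin git@gitlab.com:username/some-competition.git
$ git push -u origin master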

S3 / Dropbox

I use S3 to share files between my machines.  It is cheap – it costs me about $0.1 per month on average.

To access S3, I use AWS CLI.  I also used to use s3cmd and liked it.
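
For example, pushing a feature file from the desktop and pulling it on the laptop looks roughly like this with AWS CLI (the bucket name and path below are placeholders):

# on the desktop: upload a feature file to S3 (placeholder bucket name)
$ aws s3 cp build/feature/feature3.trn.sps s3://my-kaggle-bucket/competition/

# on the laptop: download the single file, or sync the whole feature folder
$ aws s3 cp s3://my-kaggle-bucket/competition/feature3.trn.sps build/feature/
$ aws s3 sync s3://my-kaggle-bucket/competition/ build/feature/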

I use Dropbox to share files between team members.

Makefile

For flow control or pipelining, I use makefiles (or GNU make).

It modularizes the long process of a data mining competition into feature extraction, single model training, and ensemble model training, and controls the workflow between these components.

For example, I have a top level makefile that defines the raw data file locations, folder hierarchies, and target variable.

[code title="Makefile" lang="bash"]
# directories
DIR_DATA := data
DIR_BUILD := build
DIR_FEATURE := $(DIR_BUILD)/feature
DIR_VAL := $(DIR_BUILD)/val
DIR_TST := $(DIR_BUILD)/tst

DATA_TRN := $(DIR_DATA)/train.csv
DATA_TST := $(DIR_DATA)/test.csv

Y_TRN := $(DIR_DATA)/y.trn.yht

# extract the target (2nd column) from the training data, skipping the header
$(Y_TRN): $(DATA_TRN)
	cut -d, -f2 $< | tail -n +2 > $@
[/code]

Then, I have makefiles for features that include the top level makefile and define how to generate training and test feature files in various formats (CSV, libSVM, VW, libFFM, etc.).

[code title="Makefile.feature.feature3" lang="bash"]
include Makefile

FEATURE_NAME := feature3

FEATURE_TRN := $(DIR_FEATURE)/$(FEATURE_NAME).trn.sps
FEATURE_TST := $(DIR_FEATURE)/$(FEATURE_NAME).tst.sps

FEATURE_TRN_FFM := $(DIR_FEATURE)/$(FEATURE_NAME).trn.ffm
FEATURE_TST_FFM := $(DIR_FEATURE)/$(FEATURE_NAME).tst.ffm

$(FEATURE_TRN) $(FEATURE_TST): $(DATA_TRN) $(DATA_TST) | $(DIR_FEATURE)
	src/generate_feature3.py --train-file $< \
		--test-file $(lastword $^) \
		--train-feature-file $(FEATURE_TRN) \
		--test-feature-file $(FEATURE_TST)

%.ffm: %.sps
	src/svm_to_ffm.py --svm-file $< \
		--ffm-file $@ \
		--feature-name $(FEATURE_NAME)
[/code]

Then, I have makefiles for single model training that include a feature makefile and define how to train a single model and produce CV and test predictions.

[code title="Makefile.xg" lang="bash"]
include Makefile.feature.feature3

N = 400
DEPTH = 8
LRATE = 0.05
ALGO_NAME := xg_$(N)_$(DEPTH)_$(LRATE)
MODEL_NAME := $(ALGO_NAME)_$(FEATURE_NAME)

PREDICT_VAL := $(DIR_VAL)/$(MODEL_NAME).val.yht
PREDICT_TST := $(DIR_TST)/$(MODEL_NAME).tst.yht
SUBMISSION_TST := $(DIR_TST)/$(MODEL_NAME).sub.csv

all: validation submission
validation: $(METRIC_VAL)
submission: $(SUBMISSION_TST)
retrain: clean_$(ALGO_NAME) submission

$(PREDICT_TST) $(PREDICT_VAL): $(FEATURE_TRN) $(FEATURE_TST) \
                               | $(DIR_VAL) $(DIR_TST)
	./src/train_predict_xg.py --train-file $< \
		--test-file $(word 2, $^) \
		--predict-valid-file $(PREDICT_VAL) \
		--predict-test-file $(PREDICT_TST) \
		--depth $(DEPTH) \
		--lrate $(LRATE) \
		--n-est $(N)

$(SUBMISSION_TST): $(PREDICT_TST) $(ID_TST) | $(DIR_TST)
	paste -d, $(lastword $^) $< > $@
[/code]

Then, I have makefiles for ensemble features that define which single model predictions to include in ensemble training.

[code title="Makefile.feature.esb9" lang="bash"]
include Makefile

FEATURE_NAME := esb9

BASE_MODELS := xg_600_4_0.05_feature9 \
               xg_400_4_0.05_feature6 \
               ffm_30_20_0.01_feature3 \

PREDICTS_TRN := $(foreach m, $(BASE_MODELS), $(DIR_VAL)/$(m).val.yht)
PREDICTS_TST := $(foreach m, $(BASE_MODELS), $(DIR_TST)/$(m).tst.yht)

FEATURE_TRN := $(DIR_FEATURE)/$(FEATURE_NAME).trn.csv
FEATURE_TST := $(DIR_FEATURE)/$(FEATURE_NAME).tst.csv

$(FEATURE_TRN): $(Y_TRN) $(PREDICTS_TRN) | $(DIR_FEATURE)
	paste -d, $^ > $@

$(FEATURE_TST): $(Y_TST) $(PREDICTS_TST) | $(DIR_FEATURE)
	paste -d, $^ > $@
[/code]

Finally, I can (re)produce the submission from the XGBoost ensemble of the 9 single models described in Makefile.feature.esb9 by (1) replacing include Makefile.feature.feature3 in Makefile.xg with include Makefile.feature.esb9 and (2) running:

$ make -f Makefile.xg
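
If you prefer not to edit Makefile.xg by hand, the same two steps can be scripted; this is just a convenience sketch using GNU sed on the Ubuntu desktop:

# swap the feature makefile included by Makefile.xg, then rebuild the submission
$ sed -i 's/^include Makefile.feature.*/include Makefile.feature.esb9/' Makefile.xg
$ make -f Makefile.xg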

SSH Tunneling

When I’m connected to the Internet, I always SSH into the desktop for its computational resources (mainly the RAM).

I followed Julian Simioni’s tutorial to allow remote SSH connections to the desktop.  It needs an additional system with a publicly accessible IP address.  You can set up an AWS micro (or free-tier) EC2 instance for it.
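
The idea, roughly, is that the desktop keeps a reverse tunnel open to the EC2 instance, and the laptop reaches the desktop through that instance.  Here is a minimal sketch; the user names, host name, and port 2222 are placeholders, and in practice you would keep the tunnel alive automatically (e.g., with autossh):

# on the desktop: forward port 2222 on the EC2 instance back to the desktop's SSH port
$ ssh -N -R 2222:localhost:22 user@ec2-instance

# on the laptop: hop through the EC2 instance into the desktop
$ ssh -t user@ec2-instance ssh -p 2222 user@localhost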

tmux

tmux allows you to keep your SSH sessions alive even when you get disconnected.  It also lets you split and add terminal screens in various ways and switch easily between them.

The documentation might look overwhelming, but all you need is:
# If there is no tmux session:
$ tmux

or

# If you created a tmux session, and want to connect to it:
$ tmux attach

Then to create a new pane/window and navigate in between:

  • Ctrl + b + " – to split the current window horizontally.
  • Ctrl + b + % – to split the current window vertically.
  • Ctrl + b + o – to move to next pane in the current window.
  • Ctrl + b + c – to create a new window.
  • Ctrl + b + n – to move to next window.

To close a pane/window, just type exit in the pane/window.

 

Hope this helps.

Next up is about machine learning tools I use.

Please share your setups and thoughts too. 🙂

Kaggler 0.4.0 Released (from Kaggler.com)

This article is originally posted at Kaggler.com.

UPDATE on 9/15/2015

I found a bug in OneHotEncoder and fixed it.  The fix is not available on pip yet, but you can update Kaggler to the latest version from source as follows:

$ git clone https://github.com/jeongyoonlee/Kaggler.git
$ cd Kaggler
$ python setup.py build_ext --inplace
$ sudo python setup.py install

If you find a bug, please submit a pull request to github or comment here.


I’m glad to announce the release of Kaggler 0.4.0.

Kaggler is a Python package that provides utility functions and online learning algorithms for classification.  I use it for Kaggle competitions along with scikit-learn, Lasagne, XGBoost, and Vowpal Wabbit.

Kaggler 0.4.0 added a scikit-learn-like interface for preprocessing, metrics, and online learning algorithms.

kaggler.preprocessing

Classes in kaggler.preprocessing now support fit, fit_transform, and transform methods. Currently 2 preprocessing classes are available as follows:

  • Normalizer – transforms the distributions of numerical features into a normal distribution. Note that it’s different from sklearn.preprocessing.Normalizer, which only scales features without changing their distributions.
  • OneHotEncoder – transforms categorical features into dummy variables.  It is similar to sklearn.preprocessing.OneHotEncoder except that it groups infrequent values into a dummy variable.

[code language="python"]
from kaggler.preprocessing import OneHotEncoder

# values appearing fewer than min_obs times are grouped into one dummy variable.
enc = OneHotEncoder(min_obs=10, nan_as_var=False)
X_train = enc.fit_transform(train)
X_test = enc.transform(test)
[/code]

kaggler.metrics

3 metrics are available as follows:

  • logloss – calculates the bounded log loss error for classification predictions.
  • rmse – calculates the root mean squared error for regression predictions.
  • gini – calculates the gini coefficient for regression predictions.

[code language="python"]
from kaggler.metrics import gini

score = gini(y, p)
[/code]

kaggler.online_model

Classes in kaggler.online_model (except ClassificationTree) now support fit and predict methods. Currently 5 online learning algorithms are available as follows:

  • SGD – stochastic gradient descent algorithm with hashing trick and interaction
  • FTRL – follow-the-regularized-leader algorithm with hashing trick and interaction
  • FM – factorization machine algorithm
  • NN (or NN_H2) – neural network algorithm with a single (or double) hidden layer(s)
  • ClassificationTree – decision tree algorithm

[code language="python"]
from kaggler.online_model import FTRL
from kaggler.data_io import load_data

# load a libsvm format sparse feature file
X, y = load_data('train.sparse', dense=False)

# FTRL
clf = FTRL(a=.1,              # alpha in the per-coordinate rate
           b=1,               # beta in the per-coordinate rate
           l1=1.,             # L1 regularization parameter
           l2=1.,             # L2 regularization parameter
           n=2**20,           # number of hashed features
           epoch=1,           # number of epochs
           interaction=True)  # use feature interaction or not

# training and prediction
clf.fit(X, y)
p = clf.predict(X)
[/code]

The latest code is available on github.
Package documentation is available at https://pythonhosted.org/Kaggler/.

Please let me know if you have any comments or want to contribute. 🙂

Catching Up (from Kaggler.com)

This article is originally posted at Kaggler.com.

Many things have happened since the last post in February.

1. Kaggle and other competitions

2. Kaggler package

  • Kaggler 0.3.8 was released.
  • Fellow Kaggler Jiming Ye added an online tree learner to the package.

I will post about each update soon.  Stay tuned! 🙂

Kaggler.com

I started a new blog, Kaggler.com, to write mainly about Data Science competitions at Kaggle.

I’ve enjoyed participating in Kaggle competitions since 2011.  In every competition, I’ve learned new things – new algorithms (Factorization Machine, Follow-the-Regularized-Leader), new tools (Vowpal Wabbit, XGBoost), and/or new domains.  It’s been really helpful for staying up to date in the fast-evolving fields of Machine Learning and Data Science.

With Kaggler.com, I’d like to share what I’ve learned and experienced with others.  I hope it can be useful to someone.

60 Day Journey of Deloitte Churn Prediction Competition

Competition

Last December, I teamed up with Michael once again to participate in the Deloitte Churn Prediction competition at Kaggle, where the goal was to predict which customers would leave an insurance company in the next 12 months.

It was a master competition, open only to master-level Kagglers (the top 0.2% of 138K competitors), with $70,000 in cash prizes for the top 3 finishers.

Result

We managed to do well and finished in 4th place out of 37 teams, even though we did not have much time due to projects at work and family events (especially for Michael, who became a dad during the competition).

Although we fell a little short of the prize, it was a fun experience working together with Michael, competing against other top competitors across the world, and climbing the leaderboard day by day.

Visualization

I visualized our 60-day journey during the competition below, and here are some highlights (for us):

  • Day 22-35: Dived into the competition, set up the github repo and S3 for collaboration, and climbed up the leaderboard quickly.
  • Day 41-45: Second spurt.  Dug into GBM and NN models.  Michael’s baby girl was born on Day 48.
  • Day 53-60: Last spurt.  Ensembled all models.  Improved our score every day, but didn’t have time to train the best models.

Motion Chart - Deloitte Churn Prediction Leaderboard

Once you click the image above, it will show a motion chart where:

  • X-axis: Competition day.  From day 0 to day 60.
  • Y-axis: AUC score.
  • Colored circle: Each team.  If clicked, it shows which team it represents.
  • Rightmost legend: Competition day.  You can drag the number up and down to see the chart on a specific day.

Initial positions of circles show the scores of their first submissions.

For the chart, I reused the rCharts code published by Tony Hirst on github: https://github.com/psychemedia (he also wrote a tutorial on his blog about creating a motion chart using rCharts).

Closing

We took a rain check on this, but will win next time!  🙂


Machine Learning as a Service vs Feature – BigML vs. Infer


This is a personal follow-up to the LA Machine Learning meetup, David Gerster @ eHarmony, on October 8th, 2013.

David Gerster, VP of Data Science at BigML, also former Director of Data Science at Groupon, gave an overview of BigML’s Machine Learning (ML) platform for predictive analytics. Here are my thoughts about its business model and other alternatives offering ML.

1. BigML

BigML offers so-called Machine-Learning-as-a-Service (MLaaS), which allows users (e.g., a wine seller) to upload their data (e.g., historical wine sales data) to BigML and get predictions for the variable of interest (e.g., the right prices for new wines, or expected future sales) while shielding users from the complicated Machine Learning algorithms that are the key components of predictive analytics.

2. Machine Learning as a Service

As of November 2013, there are a few start-up companies, plus Google, offering (or claiming to offer) MLaaS other than BigML:

MLaaS is similar to Analytics-as-a-Service (AaaS) in that it delivers the power of predictive analytics to users, but it is different from AaaS in that it leaves the data ETL (Extract-Transform-Load) and problem identification steps to users.

The pros and cons of MLaaS over AaaS would be:

  • Pros – Cheaper: $150-300 / month for MLaaS from BigML vs. $200-300 / hour / person (+ hardware, licensing fees) for AaaS from analytics consulting firms
  • Cons – Harder: ETL and problem identification are by far the hardest parts of predictive modeling.  No matter how good your algorithm is, garbage in leads to garbage out (ETL), and aiming at the wrong target leads to wrong predictions (problem definition).

If you know what you’d like to predict and how to clean up your data, then MLaaS would be the right solution for you.  However, for many prospective users of MLaaS (those who are inexperienced in data analytics), I suspect that’s not the case.

3. Machine-Learning-as-a-Feature, Infer.com

Another business model, besides MLaaS and AaaS, for providing users with the power of ML for their predictive modeling needs is Machine-Learning-as-a-Feature (MLaaF; don’t google it, I just made it up).

Infer, a startup founded in 2010, is on this track, offering ML plugins for popular CRM software (Salesforce, Marketo, and Eloqua) to predict sales leads.

By focusing on a specific need (sales lead prediction) and specific users (those who use popular CRM software), Infer manages to provide the power of ML with a painless user experience and an affordable price tag.

Closing Thought

To be fair, BigML’s user interface and visualization are quite impressive, and it is equipped with the Random Forest algorithm, which is one of the most popular algorithms with good out-of-the-box performance.  For data scientists who do not have much ML experience, it is worth trying out.

However, I believe that for most users, a well-defined MLaaF will work better than MLaaS (investors seem to agree with me, given that Infer got $10M in funding compared to BigML’s $1M, according to their CrunchBase profiles).


Data Science Career for Neuroscientists + Tips for Kaggle Competitions


Recently Prof. Konrad Koerding at Northwestern University asked on his Facebook for advice for one of his Ph.D. students, who studies Computational Neuroscience but wants to pursue a career in Data Science.  It reminded me of the time I was looking for such opportunities, so I shared my thoughts (now posted on his lab’s webpage here).  I decided to post it here too (with a few fixes) so that it can help others.

First, I’d like to say that Data Science is a relatively new field (like Computational Neuroscience), and you don’t need to feel bad about making the transition after your Ph.D.  When I went out to the job market, I didn’t have any analytics background at all either.

I started my industry career at an analytics consulting company, Opera Solutions in San Diego, where one of Nicolas’ friends, Jacob, runs the company’s R&D team.  Jacob also did his Ph.D. in Computational Neuroscience, under the supervision of Prof. Michael Arbib at the University of Southern California.  During the interview, I was tested on my thought process, basic knowledge of statistics and Machine Learning, and programming, all of which I had practiced every day throughout my Ph.D.

So, if he has a good Machine Learning background and programming skills (I’m sure he does, given that he’s your student), he is well equipped to pursue a career in Data Science.

Tools in Data Science

Back in graduate school, I mostly used MATLAB, with some SPSS and C.  In the Data Science field, Python and R are the most popular languages, and SQL is a kind of necessary evil.

R is similar to MATLAB except that it’s free.  It is not a hardcore programming language and doesn’t take much time to learn.  It comes with the latest statistical libraries and provides powerful plotting functions.  There are many IDEs that make R easy to use, but my favorite is RStudio.  If you run R on a server with RStudio Server, you can access it from anywhere via your web browser, which is really cool.  Although the native R plotting functions are excellent by themselves, the ggplot2 library provides more eye-catching visualizations.

For Python, the Numpy + Scipy packages provide vector-matrix computation functionality similar to MATLAB’s.  For Machine Learning algorithms, you need Scikit-Learn, and for data handling, Pandas will make your life easy.  For debugging and prototyping, the IPython Notebook is really handy and useful.

SQL is an old technology but is still widely used.  Most data is stored in data warehouses, which can be accessed only via SQL or SQL equivalents (Oracle, Teradata, Netezza, etc.).  Postgres and MySQL are powerful yet free, so they are perfect to practice with.

Hints for Kaggle Data Mining Competitions

Fortunately, I had a chance to work with many top competitors, such as the 1st and 2nd place teams in the Netflix competition, and learn how they approach competitions.  Here are some tips I found helpful.

1. Don’t jump into algorithms too fast.

Spend enough time to understand the data.  Algorithms are important, but no matter how good an algorithm you use, garbage in only leads to garbage out.  Many classification/regression algorithms assume Gaussian-distributed variables and fail to make good predictions if you provide non-Gaussian-distributed variables.  So standardization, normalization, non-linear transformation, discretization, and binning are very important.

2. Try different algorithms and blend.  

There is no universally optimal algorithm.  Most of the time (if not always), the winning solutions are ensembles of many individual models built with tens of different algorithms.  Combining different kinds of models can improve prediction performance a lot.  For individual models, I found Random Forest, Gradient Boosting Machine, Factorization Machine, Neural Network, Support Vector Machine, logistic/linear regression, Naive Bayes, and collaborative filtering to be the most useful.  Gradient Boosting Machine and Factorization Machine are often the best individual models.

3. Optimize at last.

Each competition has a different evaluation metric, and optimizing algorithms to do their best on that metric can improve your chance to win.  The two most popular metrics are RMSE and AUC (area under the ROC curve).  An algorithm that optimizes one metric is not optimal for the other.  Many open source implementations provide only RMSE optimization, so for AUC (or another metric) you need to implement the optimization yourself.


For Teenagers to Learn Programming – Where and How to Start

code monkey, code!

When I talk to teenagers or their parents, I always recommend that teenagers learn a programming language because:

  • It is a fun toy like LEGO that allows you to create whatever you like instead of playing with what others created.
  • It is a universal language like music, arts, and sports that allows you to communicate with and reach out to other people around the world. 
  • It is an effective tool that increases your productivity to infinity by automating and delegating your work to a computer (or a “cloud” of computers).
  • Since it is a language like other spoken human languages, the earlier you start, the easier and faster you can learn it.

In return, I’ve been asked how and where to start.  Here are some good resources to start with:

1. Get Motivated

  • Leaders and trend-setters all agree on one thing – Leaders in business, government, arts and entertainment, and education share why they think programming is important.
  • Why software is eating the world – a Wall Street Journal article by Marc Andreessen, one of the top venture capitalists, about how and why the software industry has been taking over other industries such as healthcare, finance, telecom, arts and entertainment, etc.

2. Resources

3. Which language to learn

Here I list some languages that, I think, are relatively easy to learn, quick to produce working results, and useful even after you become a professional (except Scratch).

  • For kids to get familiar with programming – Scratch, which is designed for ages 8 to 16 to create programs without using programming languages (check out the TED video below to see what it looks like).
  • For high schoolers or beginners with interests in developing working programs – Python, which is an easy yet powerful language that can cover almost every vertical (from web pages to scientific programs).
  • For designers with interests in developing interactive art works – Processing, which offers beautiful and interactive visualization capabilities.
  • For those who are interested in finance or business intelligence – R, which was originally developed as a statistical computing language but is widely used in predictive modeling, data exploration, and data visualization.

I hope everyone who comes across this article gives programming a try and discovers how fun and useful it is. 🙂


Faith and Suffering

Inspired by today’s Family Life Today, “Why Me”.

Gerald Sittser lost his wife, mother and daughter in an accident that left him wondering, “Why?”. Jerry came to realize that God didn’t promise him a pain-free life, but promised instead to be with him in his loss and suffering.

We hold a simple faith that if we keep our faith and live rightly, God will grant us a good life, a life without suffering.  It is a simple faith in the sense that we are not hoping, as some do, for the "blessing" of great "success".  So when we meet hardship in life, we first look for what we did wrong and try to escape the suffering by correcting it.

In many cases, however, we face so-called suffering without cause, even though we did nothing particularly wrong: illness, accidents, failure that follows diligence, hardship that follows doing the right thing.  Then we are bewildered, we ask "Why me?", and we pray, sometimes even arguing with God, that He would set the wrong situation right.

This reaction comes from the perception that suffering is abnormal, that something has gone wrong.

In the Bible, however, suffering is shown rather as something necessary in our lives.  The Bible says that suffering naturally follows those who believe.

  • Join with me in suffering for the gospel, by the power of God. (from 2 Timothy 1:8)
  • It is better, if it is God's will, to suffer for doing good. (from 1 Peter 3:17)

What God has promised us is the Holy Spirit and the resurrection, not a life without suffering.  Instead, the Bible says that God is our comfort in the midst of suffering and that we will be rewarded later (or after the resurrection).  It also says that suffering itself is beneficial, because through it we become more mature and become able to comfort others who suffer.

  • He comforts us in all our troubles, so that with the comfort we receive from God we can comfort those in any trouble. (2 Corinthians 1:4)
  • This is my comfort in my affliction, for Your word has given me life. (Psalm 119:50)
  • Our present sufferings are not worth comparing with the glory that will be revealed in us. (Romans 8:18)
  • Before I was afflicted I went astray, but now I keep Your word.  It was good for me to be afflicted, so that I might learn Your statutes. (Psalm 119:67, 71)

I, too, met God during the hardest period of my life: the God who knew my sorrow and pain, and who comforted and encouraged me.  And after overcoming that period, I found myself incomparably more mature and stronger than before.  The suffering in my life is a precious asset that led me to experience God more closely and to draw nearer to Him.

  • But He knows the way that I take; when He has tested me, I will come forth as pure gold. (Job 23:10)

Then how should we respond when we suffer in life?

The Bible tells us to endure and to pray when we suffer.

  • Be joyful in hope, patient in affliction, faithful in prayer. (Romans 12:12)
  • Is anyone among you suffering?  Let him pray.  Is anyone cheerful?  Let him sing praise. (James 5:13)

Paul, arguably the person who suffered the most in the New Testament, tells us to rejoice in every circumstance and confesses that he himself rejoices even in suffering.

  • Rejoice in the Lord always.  I will say it again: Rejoice! (Philippians 4:4)
  • We also rejoice in our sufferings, because we know that suffering produces perseverance; perseverance, character; and character, hope. (Romans 5:3-4)

Paul says that the secret of his constant joy is to worry about nothing, to give thanks, and to pray, and that when he does so, the peace of God guards his thoughts and heart.  He also confesses that because of this he can do all things in whatever circumstances he finds himself.

  • Do not be anxious about anything, but in everything, by prayer and petition, with thanksgiving, present your requests to God.  And the peace of God, which transcends all understanding, will guard your hearts and your minds in Christ Jesus. (Philippians 4:6-7)
  • I know what it is to be in need, and I know what it is to have plenty.  I have learned the secret of being content in any and every situation, whether well fed or hungry, whether living in plenty or in want.  I can do all things through Him who gives me strength. (Philippians 4:12-13)

What a wonderful confession this is.

Finally, as Jesus concludes the Sermon on the Mount, He says that both the house built on the rock (the one who lives by His word) and the house built on the sand (the one who does not) will be struck by rain, wind, and floods.

  • Everyone who hears these words of mine and puts them into practice is like a wise man who built his house on the rock.  The rain came down, the streams rose, and the winds blew and beat against that house, yet it did not fall, because its foundation was on the rock.  But everyone who hears these words of mine and does not put them into practice is like a foolish man who built his house on sand.  The rain came down, the streams rose, and the winds blew and beat against that house, and it fell with a great crash. (Matthew 7:24-27)

But He says that the house built on the rock stands firm even through the rain, wind, and floods.

I hope that when we meet suffering in our lives, we can all respond with composure like Paul and stand firm like the house built on the rock.

To that end, I hope we live by His word when there is no suffering, and face suffering not with worry but with prayer and thanksgiving.