<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>A Novice Dad's Machine Learning Study Blog</title>
    <link>https://deadsquart.tistory.com/</link>
    <description>A novice dad's machine learning study log
- a blog where I rewrite well-known lectures so they are easy for me to follow</description>
    <language>ko</language>
    <pubDate>Sat, 4 Jul 2026 15:55:25 +0900</pubDate>
    <generator>TISTORY</generator>
    <ttl>100</ttl>
    <managingEditor>초코린</managingEditor>
    <image>
      <title>A Novice Dad's Machine Learning Study Blog</title>
      <url>https://tistory1.daumcdn.net/tistory/3075244/attach/df924f46a0724534a488f68c8b5fd341</url>
      <link>https://deadsquart.tistory.com</link>
    </image>
    <item>
      <title>10.7 Gibbs Sampling</title>
      <link>https://deadsquart.tistory.com/72</link>
      <description>&lt;p&gt;Gibbs sampling is a special case of the M-H (Metropolis-Hastings) algorithm.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cbgvUL/btqARxO2Ueo/vKxH8zWkGQrcag5PoWtcxK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cbgvUL/btqARxO2Ueo/vKxH8zWkGQrcag5PoWtcxK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cbgvUL/btqARxO2Ueo/vKxH8zWkGQrcag5PoWtcxK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcbgvUL%2FbtqARxO2Ueo%2FvKxH8zWkGQrcag5PoWtcxK%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;The distinctive feature of the M-H algorithm was its proposal distribution.&lt;/p&gt;
&lt;p&gt;Because we do not know what properties the proposal distribution has, we formed a ratio and &lt;span style=&quot;color: #333333;&quot;&gt;defined the acceptance probability from it.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;So how should we choose the proposal distribution? The idea behind Gibbs sampling is: why construct a new proposal distribution at all, when we can reuse the already well-defined p?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;Each time we sample, update only a single component z_k to a new value z_k* and keep all the others as they are.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;The proposal used in Gibbs sampling is shown below.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/tWOy3/btqANDDQ7za/w5bKYu97kVsCfpxCuggrS1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/tWOy3/btqANDDQ7za/w5bKYu97kVsCfpxCuggrS1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/tWOy3/btqANDDQ7za/w5bKYu97kVsCfpxCuggrS1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FtWOy3%2FbtqANDDQ7za%2Fw5bKYu97kVsCfpxCuggrS1%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;The point is to reuse the existing p rather than construct a new q, and to update only one latent variable z_k* at a time. With q defined as above, the acceptance probability drops out of the algorithm (it becomes 1), and the detailed balance equation can be shown to hold.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;In other words, every proposed sample is accepted.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dfgX22/btqAN2KabjA/Zcu0jL9PkjVZFyKHfnmkP0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dfgX22/btqAN2KabjA/Zcu0jL9PkjVZFyKHfnmkP0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dfgX22/btqAN2KabjA/Zcu0jL9PkjVZFyKHfnmkP0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdfgX22%2FbtqAN2KabjA%2FZcu0jL9PkjVZFyKHfnmkP0%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b28sMC/btqAOKa9Qew/ehGV7zBDsKGHRRnNb49Ub0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b28sMC/btqAOKa9Qew/ehGV7zBDsKGHRRnNb49Ub0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b28sMC/btqAOKa9Qew/ehGV7zBDsKGHRRnNb49Ub0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb28sMC%2FbtqAOKa9Qew%2FehGV7zBDsKGHRRnNb49Ub0%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;Run the for loop for a very long time (T iterations). Update one variable, then the next one; while a variable is being updated, all the other variables are treated as given, and each update is made by drawing a sample.&lt;/p&gt;
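&lt;p&gt;The loop described above can be sketched in Python. This is only a minimal illustration under assumptions that are not from the lecture: the target is a two-dimensional standard normal with correlation rho = 0.8 (picked because both conditionals are known normals), and the name gibbs_bivariate_normal, the seed, and the burn-in length are hypothetical.&lt;/p&gt;
&lt;pre&gt;
```python
import random


def gibbs_bivariate_normal(rho, n_steps, seed=0):
    """Gibbs sampler for a 2-D standard normal with correlation rho."""
    rng = random.Random(seed)
    # Conditional of one coordinate given the other: N(rho * other, 1 - rho^2).
    cond_std = (1.0 - rho * rho) ** 0.5
    x, y = 0.0, 0.0  # arbitrary starting assignment
    samples = []
    for _ in range(n_steps):
        x = rng.gauss(rho * y, cond_std)  # update x while y is treated as given
        y = rng.gauss(rho * x, cond_std)  # update y while the new x is given
        samples.append((x, y))
    return samples


samples = gibbs_bivariate_normal(rho=0.8, n_steps=20000)
kept = samples[2000:]  # drop an assumed burn-in period
mean_x = sum(x for x, _ in kept) / len(kept)
mean_xy = sum(x * y for x, y in kept) / len(kept)
print(mean_x, mean_xy)
```
&lt;/pre&gt;
&lt;p&gt;There is no accept or reject step anywhere in the loop: every draw from a conditional is kept. With enough iterations, mean_x should come out near 0 and mean_xy near rho.&lt;/p&gt;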
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bxhQk1/btqANCEXRr3/RhMt8B3BCW58v27HzfI4E1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bxhQk1/btqANCEXRr3/RhMt8B3BCW58v27HzfI4E1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bxhQk1/btqANCEXRr3/RhMt8B3BCW58v27HzfI4E1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbxhQk1%2FbtqANCEXRr3%2FRhMt8B3BCW58v27HzfI4E1%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;Each data point thus receives an assignment of z.&lt;/p&gt;
&lt;p&gt;Because Gibbs sampling is sampling-based, it can be slow to converge. However, it can be more accurate.&lt;/p&gt;
&lt;p&gt;EM can run into saddle-point and local-optimum problems. However, EM can run fast.&lt;/p&gt;</description>
      <category>머신러닝/문일철 교수님 강의 정리 (인공지능및기계학습개론)</category>
      <category>GibbsSampling</category>
      <category>GMM</category>
      <category>깁스샘플링</category>
      <author>초코린</author>
      <guid isPermaLink="true">https://deadsquart.tistory.com/72</guid>
      <comments>https://deadsquart.tistory.com/72#entry72comment</comments>
      <pubDate>Wed, 1 Jan 2020 20:26:37 +0900</pubDate>
    </item>
    <item>
      <title>10.6 Metropolis-Hastings Algorithm</title>
      <link>https://deadsquart.tistory.com/71</link>
      <description>&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/CHIMC/btqAmsJli4E/AiogjBHUh8eCfGqhPSG0n0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/CHIMC/btqAmsJli4E/AiogjBHUh8eCfGqhPSG0n0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/CHIMC/btqAmsJli4E/AiogjBHUh8eCfGqhPSG0n0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FCHIMC%2FbtqAmsJli4E%2FAiogjBHUh8eCfGqhPSG0n0%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
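&lt;p&gt;Given the current assignment z_t, the MH algorithm proposes a candidate for the assignment it will move to in the next transition. Proposing from existing information (z_t) comes from the foundation of MCMC. The distribution that generates the candidate is called the proposal distribution.&lt;/p&gt;
&lt;p&gt;Next, toss a coin with probability &amp;alpha;: if the candidate is accepted, take z*; if not, keep the current z_t. This is the core of the MH algorithm.&lt;/p&gt;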
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cgwikT/btqAowcIV3Z/gNtI6Ng7mrNRcpy7f0kn9K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cgwikT/btqAowcIV3Z/gNtI6Ng7mrNRcpy7f0kn9K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cgwikT/btqAowcIV3Z/gNtI6Ng7mrNRcpy7f0kn9K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcgwikT%2FbtqAowcIV3Z%2FgNtI6Ng7mrNRcpy7f0kn9K%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;q is something we choose. p is something we can evaluate.&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;Satisfying the condition for a reversible Markov chain is all we need, and this is &lt;/span&gt;the key to choosing the acceptance probability in the MH algorithm.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;If the ratio r above is less than 1, the probability flow from z_t to z* is larger than the flow from z* back to z_t.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;In that case, without a correction, detailed balance does not hold and the chain is not a reversible Markov chain.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Since we have no particular information about q, this can easily happen.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;So we use accept and reject decisions to adjust the balance between z* and z_t, and build the acceptance probability for that purpose.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b5pYZH/btqAo7cHhnV/ZzUYMeb2VmKnzCXWBZA64K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b5pYZH/btqAo7cHhnV/ZzUYMeb2VmKnzCXWBZA64K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b5pYZH/btqAo7cHhnV/ZzUYMeb2VmKnzCXWBZA64K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb5pYZH%2FbtqAo7cHhnV%2FZzUYMeb2VmKnzCXWBZA64K%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;Setting the value to 1 above means raising it to the maximum a probability can take.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;That is, even when q is not well characterized, the acceptance probability built from the ratio makes the detailed balance equation hold, &lt;span style=&quot;color: #333333;&quot;&gt;which makes the chain a reversible Markov chain, and repeated transitions then lead &lt;/span&gt;to the stationary distribution.&amp;nbsp;&lt;/p&gt;
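&lt;p&gt;This claim can be checked numerically on a tiny discrete example. The sketch below is an illustration under assumptions, not the lecture's code: the three-state target p, the deliberately asymmetric proposal matrix q, and the helper name mh_transition_matrix are all hypothetical. It builds the transition matrix that results from "propose with q, accept with probability min(1, r)" and measures how far it is from detailed balance.&lt;/p&gt;
&lt;pre&gt;
```python
def mh_transition_matrix(p, q):
    """Transition matrix of Metropolis-Hastings on a finite state space."""
    n = len(p)
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Ratio r for a move from state i to candidate j.
                r = (p[j] * q[j][i]) / (p[i] * q[i][j])
                T[i][j] = q[i][j] * min(1.0, r)
        # Rejected proposals leave the chain where it is.
        T[i][i] = 1.0 - sum(T[i])
    return T


p = [0.2, 0.5, 0.3]          # hypothetical target distribution
q = [[0.1, 0.6, 0.3],        # hypothetical proposal, rows sum to 1
     [0.4, 0.2, 0.4],
     [0.5, 0.3, 0.2]]

T = mh_transition_matrix(p, q)
n = len(p)

# Detailed balance: p[i] * T[i][j] should equal p[j] * T[j][i].
max_violation = max(
    abs(p[i] * T[i][j] - p[j] * T[j][i]) for i in range(n) for j in range(n)
)
# Stationarity: one transition applied to p should return p.
p_next = [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]
print(max_violation, p_next)
```
&lt;/pre&gt;
&lt;p&gt;Even though q itself is not symmetric, max_violation should be zero up to floating-point rounding, and p_next should reproduce p.&lt;/p&gt;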
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/t1Fth/btqAm4nXPLn/RVBqH8ovxaeqn1LvLex621/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/t1Fth/btqAm4nXPLn/RVBqH8ovxaeqn1LvLex621/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/t1Fth/btqAm4nXPLn/RVBqH8ovxaeqn1LvLex621/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Ft1Fth%2FbtqAm4nXPLn%2FRVBqH8ovxaeqn1LvLex621%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;How does the random walk variant work?&lt;/p&gt;
&lt;p&gt;Suppose z* is drawn from a normal distribution. z* is generated from the current z_t.&lt;/p&gt;
&lt;p&gt;In the figure above, if the chain moves from z_t to z* at sample t, then z* becomes z_t at sample t+1, and we again sample from the normal distribution to find a new z*.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bX0zs9/btqAm4nYhFk/3k53BZPhuSc3j5uVSy7e3K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bX0zs9/btqAm4nYhFk/3k53BZPhuSc3j5uVSy7e3K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bX0zs9/btqAm4nYhFk/3k53BZPhuSc3j5uVSy7e3K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbX0zs9%2FbtqAm4nYhFk%2F3k53BZPhuSc3j5uVSy7e3K%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;When &amp;sigma; is small, sampling proceeds in small steps, which gives finer resolution.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;When &amp;sigma; is large, sampling proceeds in wide steps.&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
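&lt;p&gt;The effect of the step size can be seen in a short random walk M-H sketch. Everything specific here is an assumption made for illustration: the target is an unnormalized mixture of two unit-variance Gaussians centered at -2 and +2, the two sigma values 0.1 and 3.0 are arbitrary, and the names target_density and random_walk_mh are hypothetical. Because the normal proposal is symmetric, q cancels in the ratio r.&lt;/p&gt;
&lt;pre&gt;
```python
import math
import random


def target_density(z):
    """Unnormalized mixture of two unit-variance Gaussians at -2 and +2."""
    return math.exp(-0.5 * (z + 2.0) ** 2) + math.exp(-0.5 * (z - 2.0) ** 2)


def random_walk_mh(sigma, n_steps, seed=0):
    rng = random.Random(seed)
    z_t = 0.0  # arbitrary starting point
    samples = []
    n_accepted = 0
    for _ in range(n_steps):
        z_star = rng.gauss(z_t, sigma)  # candidate centered on the current z_t
        # Symmetric proposal, so the ratio reduces to p(z*) / p(z_t).
        r = target_density(z_star) / target_density(z_t)
        alpha = min(1.0, r)
        if alpha > rng.random():  # coin toss with probability alpha
            z_t = z_star
            n_accepted += 1
        samples.append(z_t)  # on rejection the old z_t is recorded again
    return samples, n_accepted / n_steps


small_samples, small_rate = random_walk_mh(sigma=0.1, n_steps=20000)
large_samples, large_rate = random_walk_mh(sigma=3.0, n_steps=20000)
large_mean = sum(large_samples) / len(large_samples)
print(small_rate, large_rate, large_mean)
```
&lt;/pre&gt;
&lt;p&gt;With the small sigma almost every candidate is accepted but the chain moves slowly; with the large sigma more candidates are rejected but the chain can jump between the two modes.&lt;/p&gt;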
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;When using random walk M-H, it seems necessary to start with a large &amp;sigma; to locate the modes and, once they are found, to reduce &amp;sigma; and sample more finely.&lt;/span&gt;&lt;/p&gt;</description>
      <category>머신러닝/문일철 교수님 강의 정리 (인공지능및기계학습개론)</category>
      <category>MetropolisHastings</category>
      <category>메트로폴리스헤이스팅스</category>
      <author>초코린</author>
      <guid isPermaLink="true">https://deadsquart.tistory.com/71</guid>
      <comments>https://deadsquart.tistory.com/71#entry71comment</comments>
      <pubDate>Wed, 11 Dec 2019 21:40:02 +0900</pubDate>
    </item>
    <item>
      <title>Week 10.5 Markov Chain for Sampling</title>
      <link>https://deadsquart.tistory.com/70</link>
      <description>&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/oipZ8/btqAnoZOCKM/dr3FAukJQDslAK0jjfZkxk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/oipZ8/btqAnoZOCKM/dr3FAukJQDslAK0jjfZkxk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/oipZ8/btqAnoZOCKM/dr3FAukJQDslAK0jjfZkxk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FoipZ8%2FbtqAnoZOCKM%2Fdr3FAukJQDslAK0jjfZkxk%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
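&lt;p&gt;The problem with the earlier sampling methods is that they make no use of past records.&lt;/p&gt;
&lt;p&gt;Rejection sampling threw a sample away when it did not meet the condition.&lt;/p&gt;
&lt;p&gt;In importance sampling, the individual samples were all independent.&lt;/p&gt;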
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;In an inference problem, the key is to assign values to Z. So let us assign z through sampling.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/N1Mif/btqAnYTQUvy/IvmNVca3YOwxMY5ZmVIQlK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/N1Mif/btqAnYTQUvy/IvmNVca3YOwxMY5ZmVIQlK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/N1Mif/btqAnYTQUvy/IvmNVca3YOwxMY5ZmVIQlK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FN1Mif%2FbtqAnYTQUvy%2FIvmNVca3YOwxMY5ZmVIQlK%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;If E is a latent variable, let us optimize it with a sampling-based approach.&lt;/p&gt;
&lt;p&gt;We want to assign values to the random variables, and we will do it in the form of a process.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dxC7dJ/btqAjWKEBJn/CcsSXSNeFQWz6knigyZkZK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dxC7dJ/btqAjWKEBJn/CcsSXSNeFQWz6knigyZkZK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dxC7dJ/btqAjWKEBJn/CcsSXSNeFQWz6knigyZkZK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdxC7dJ%2FbtqAjWKEBJn%2FCcsSXSNeFQWz6knigyZkZK%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;Suppose an assignment was made at z1. The idea is to use the existing information when producing the next one. This is how a Markov chain is used to assign latent variables.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/boT5yU/btqAms9vl4K/KRkxleiRTIMgIGZSwNPj7K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/boT5yU/btqAms9vl4K/KRkxleiRTIMgIGZSwNPj7K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/boT5yU/btqAms9vl4K/KRkxleiRTIMgIGZSwNPj7K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FboT5yU%2FbtqAms9vl4K%2FKRkxleiRTIMgIGZSwNPj7K%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;In MCMC, suppose the stationary distribution is known. Then what transition matrix produces that stationary distribution? This is the question MCMC is concerned with.&lt;/p&gt;
&lt;p&gt;Assignment is what matters in inference, so the idea of MCMC is to design the transition matrix well and use it to sample at every step.&lt;/p&gt;
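&lt;p&gt;As a rough illustration of sampling step by step with a transition matrix, the sketch below simulates a small chain from two different starting states and records how often each state is visited. The 3-state matrix T, the seeds, and the name run_chain are hypothetical choices for this example, not values from the lecture.&lt;/p&gt;
&lt;pre&gt;
```python
import random

# Hypothetical transition matrix: T[i][j] is the probability of moving i to j.
T = [[0.9, 0.1, 0.0],
     [0.2, 0.6, 0.2],
     [0.0, 0.3, 0.7]]


def run_chain(T, start, n_steps, seed=0):
    """Simulate the chain and return the fraction of time spent in each state."""
    rng = random.Random(seed)
    states = range(len(T))
    state = start
    counts = [0] * len(T)
    for _ in range(n_steps):
        # The next assignment depends only on the current one.
        state = rng.choices(states, weights=T[state])[0]
        counts[state] += 1
    return [c / n_steps for c in counts]


freq_from_0 = run_chain(T, start=0, n_steps=50000, seed=1)
freq_from_2 = run_chain(T, start=2, n_steps=50000, seed=2)
print(freq_from_0)
print(freq_from_2)
```
&lt;/pre&gt;
&lt;p&gt;Solving the stationarity condition for this particular T by hand gives (6/11, 3/11, 2/11), and both runs should land close to it regardless of the starting state.&lt;/p&gt;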
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;The core idea of MCMC is this: start from an arbitrary assignment, keep using each assignment as the reference when sampling the next one, and after many such steps the assignments should come to resemble the stationary distribution.&lt;/p&gt;</description>
      <category>머신러닝/문일철 교수님 강의 정리 (인공지능및기계학습개론)</category>
      <category>MonteCarlo</category>
      <category>마코프체인</category>
      <category>마코프체인몬테카를로방법</category>
      <author>초코린</author>
      <guid isPermaLink="true">https://deadsquart.tistory.com/70</guid>
      <comments>https://deadsquart.tistory.com/70#entry70comment</comments>
      <pubDate>Tue, 10 Dec 2019 20:20:36 +0900</pubDate>
    </item>
    <item>
      <title>Week 10.4 Markov Chain</title>
      <link>https://deadsquart.tistory.com/69</link>
      <description>&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/PJcRe/btqAaWhZ2EI/5NkcllYY80q8BYKWP6ze30/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/PJcRe/btqAaWhZ2EI/5NkcllYY80q8BYKWP6ze30/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/PJcRe/btqAaWhZ2EI/5NkcllYY80q8BYKWP6ze30/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FPJcRe%2FbtqAaWhZ2EI%2F5NkcllYY80q8BYKWP6ze30%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;Suppose the system moves from a state at time t to the next state at time t+1 according to a matrix.&lt;/p&gt;
&lt;p&gt;(The matrix T holds the probabilities of transitioning from state i to state j.)&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/weLjj/btqz73C4T0t/BPHf8ZdEVkU3AA3aAVPQ80/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/weLjj/btqz73C4T0t/BPHf8ZdEVkU3AA3aAVPQ80/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/weLjj/btqz73C4T0t/BPHf8ZdEVkU3AA3aAVPQ80/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FweLjj%2Fbtqz73C4T0t%2FBPHf8ZdEVkU3AA3aAVPQ80%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
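&lt;p&gt;&lt;b&gt;Accessible&lt;/b&gt; :&amp;nbsp;state j is accessible from state i when the chain can get from i to j. Two states communicate when the chain can travel between them in both directions.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Reducibility&lt;/b&gt; : when every pair of states i and j in the state space communicates, the Markov chain cannot be reduced any further, i.e. it is irreducible.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Transience&lt;/b&gt; : a transient state is one that the chain may never visit again.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;A state j is recurrent if, given that the system is in j at time 0, the probability of visiting j again at some later time is 1.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Ergodicity : &lt;/b&gt;a state is ergodic if it is revisited (recurrent) but not on a fixed schedule (aperiodic), so we cannot say exactly when the return will happen.&lt;/p&gt;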
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bsYVjN/btqz74IMAZR/UaVsUJG4sDwOWLWnTWRmk0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bsYVjN/btqz74IMAZR/UaVsUJG4sDwOWLWnTWRmk0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bsYVjN/btqz74IMAZR/UaVsUJG4sDwOWLWnTWRmk0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbsYVjN%2Fbtqz74IMAZR%2FUaVsUJG4sDwOWLWnTWRmk0%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;Return time : suppose the chain visits state i at time 0. The time at which it next visits state i is called the return time.&lt;/p&gt;
&lt;p&gt;&amp;pi; : a value defined for every state; the probability distribution over which state the system is in.&lt;/p&gt;
&lt;p&gt;What sets a stationary distribution apart from other distributions over the nodes of a Markov chain?&lt;/p&gt;
&lt;p&gt;If the current distribution is the stationary distribution, it is still the same stationary distribution after a transition to the next state.&lt;/p&gt;
&lt;p&gt;That is, &lt;span style=&quot;color: #333333;&quot;&gt;&amp;pi; T = &lt;span style=&quot;color: #333333;&quot;&gt;&amp;pi;.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
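&lt;p&gt;The stationarity condition can be checked with a few lines of Python. This is a sketch under assumptions: the 3-state matrix T is a hypothetical example, and step is a hypothetical helper that multiplies a row vector by T. Starting from an arbitrary distribution and applying T repeatedly should settle on a vector that one more transition leaves unchanged.&lt;/p&gt;
&lt;pre&gt;
```python
def step(dist, T):
    """One transition: returns the row vector dist multiplied by T."""
    n = len(T)
    return [sum(dist[i] * T[i][j] for i in range(n)) for j in range(n)]


# Hypothetical transition matrix: T[i][j] is the probability of moving i to j.
T = [[0.9, 0.1, 0.0],
     [0.2, 0.6, 0.2],
     [0.0, 0.3, 0.7]]

dist = [1.0, 0.0, 0.0]  # start with all probability mass on state 0
for _ in range(500):
    dist = step(dist, T)

pi = dist
pi_next = step(pi, T)  # one more transition should leave pi unchanged
print(pi)
print(pi_next)
```
&lt;/pre&gt;
&lt;p&gt;For this T the fixed point can also be solved by hand, giving (6/11, 3/11, 2/11), which the loop should reproduce.&lt;/p&gt;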
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
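&lt;p&gt;A reversible MC is one in which the probability of being in state i and transitioning to state j equals the probability of being in &lt;span style=&quot;color: #333333;&quot;&gt;state j and transitioning to state i&lt;/span&gt;, i.e. &amp;pi;_i T_ij = &amp;pi;_j T_ji.&lt;/p&gt;</description>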
      <category>머신러닝/문일철 교수님 강의 정리 (인공지능및기계학습개론)</category>
      <category>MarkovChain</category>
      <category>마르코프체인</category>
      <author>초코린</author>
      <guid isPermaLink="true">https://deadsquart.tistory.com/69</guid>
      <comments>https://deadsquart.tistory.com/69#entry69comment</comments>
      <pubDate>Mon, 2 Dec 2019 21:16:44 +0900</pubDate>
    </item>
    <item>
      <title>Week 10.3 Importance Sampling</title>
      <link>https://deadsquart.tistory.com/68</link>
      <description>&lt;p&gt;Importance sampling was devised to make up for the shortcomings of rejection sampling.&lt;/p&gt;
&lt;p&gt;Why do we sample? To compute an expectation or to compute a probability.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;In other words, if we can do those two things well, we do not need to brute-force sample over and over. That is the motivation for importance sampling.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/ceeMCF/btqz6gnClQJ/jLZfjjAddJS9kkUvQg1IFk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/ceeMCF/btqz6gnClQJ/jLZfjjAddJS9kkUvQg1IFk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/ceeMCF/btqz6gnClQJ/jLZfjjAddJS9kkUvQg1IFk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FceeMCF%2Fbtqz6gnClQJ%2FjLZfjjAddJS9kkUvQg1IFk%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;Let us compute E(f) for some function f. There is a random variable z that goes into f, and there may be a probability distribution p(z) that generates that random variable.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/M5BHG/btqz4hVzfcw/zGrm8gOISSMWHdxnLfgtW0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/M5BHG/btqz4hVzfcw/zGrm8gOISSMWHdxnLfgtW0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/M5BHG/btqz4hVzfcw/zGrm8gOISSMWHdxnLfgtW0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FM5BHG%2Fbtqz4hVzfcw%2FzGrm8gOISSMWHdxnLfgtW0%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
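&lt;p&gt;The rightmost term of the expression above is an approximation, not an equality. Everything except the rightmost term is written as a computation over the entire (infinite) domain, which is impossible in practice.&amp;nbsp; &amp;nbsp;What we can do in practice is sample individual instances and evaluate the probability of each one.&amp;nbsp;&lt;/p&gt;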
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dLoUy8/btqz4KDhScx/Uirsd2YzkIcYnwyGfXLs4K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dLoUy8/btqz4KDhScx/Uirsd2YzkIcYnwyGfXLs4K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dLoUy8/btqz4KDhScx/Uirsd2YzkIcYnwyGfXLs4K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdLoUy8%2Fbtqz4KDhScx%2FUirsd2YzkIcYnwyGfXLs4K%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;Suppose we want the probability that a sample is greater than 1, i.e. P(Z&amp;gt;1).&amp;nbsp;&lt;/p&gt;
&lt;p&gt;We need to sum p(z) over every case where z is greater than 1. This is not easy to compute directly (the integral is hard), so we use sampling: we draw many samples from the sampling distribution q(z).&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Each sampled instance is z_l. f is the indicator function 1(z &amp;gt; 1); q(z_l) is the value of the sampling distribution that produced z_l; and p(z_l) can be computed without sampling from p, because we assume p can be evaluated. The trailing z_l &amp;gt; 1 factor is 1 when z_l is greater than 1 and 0 otherwise. Solving the problem through samples in this way is called importance sampling.&amp;nbsp;&lt;/p&gt;
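&lt;p&gt;The estimate described above can be sketched as follows. The concrete choices are assumptions for illustration only: p is taken to be a standard normal, q is a normal centered at 1.5 (so that more samples fall in the region z &amp;gt; 1), and the names normal_pdf and importance_estimate are hypothetical. For this p the exact answer is available through the complementary error function, which gives something to compare against.&lt;/p&gt;
&lt;pre&gt;
```python
import math
import random


def normal_pdf(z, mean, std):
    """Density of a normal distribution evaluated at z."""
    return math.exp(-0.5 * ((z - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def importance_estimate(n_samples, seed=0):
    """Estimate P(Z > 1) for Z ~ N(0, 1) using samples from q = N(1.5, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        z_l = rng.gauss(1.5, 1.0)  # draw z_l from the sampling distribution q
        if z_l > 1.0:              # indicator f(z_l): contributes only when z_l > 1
            # Importance weight p(z_l) / q(z_l); p is evaluated, never sampled.
            total += normal_pdf(z_l, 0.0, 1.0) / normal_pdf(z_l, 1.5, 1.0)
    return total / n_samples


estimate = importance_estimate(50000)
exact = 0.5 * math.erfc(1.0 / math.sqrt(2.0))  # closed form for this p
print(estimate, exact)
```
&lt;/pre&gt;
&lt;p&gt;No sample is thrown away here; samples outside the region simply contribute zero, and the rest are reweighted by p/q.&lt;/p&gt;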
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Now let us see how importance sampling is done in a discrete domain.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bUJ51l/btqz6htlCuL/vKtpnYkgDHEX8iKrKP3uF1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bUJ51l/btqz6htlCuL/vKtpnYkgDHEX8iKrKP3uF1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bUJ51l/btqz6htlCuL/vKtpnYkgDHEX8iKrKP3uF1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbUJ51l%2Fbtqz6htlCuL%2FvKtpnYkgDHEX8iKrKP3uF1%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;Start with the sampling weight set to 1, and sample the variables one by one by coin toss.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Suppose T comes up with probability 0.001 and F with probability 0.999, and F came up. At this point we set the sampling weight: the weight reflects how important (how probable) the outcomes so far are.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Here NormSW can be viewed as the 1/L*q(z_l) part of the earlier numerical view, and SumSW as p(z_l).&lt;/p&gt;</description>
      <category>머신러닝/문일철 교수님 강의 정리 (인공지능및기계학습개론)</category>
      <category>importancesampling</category>
      <category>샘플링기반추론</category>
      <category>중요도샘플링</category>
      <author>초코린</author>
      <guid isPermaLink="true">https://deadsquart.tistory.com/68</guid>
      <comments>https://deadsquart.tistory.com/68#entry68comment</comments>
      <pubDate>Thu, 28 Nov 2019 22:24:00 +0900</pubDate>
    </item>
    <item>
      <title>Week 10.2 Rejection Sampling</title>
      <link>https://deadsquart.tistory.com/67</link>
      <description>&lt;p&gt;Rejection sampling means sampling subject to some conditions.&lt;/p&gt;
&lt;p&gt;It requires sampling many times (many iterations).&lt;/p&gt;
&lt;p&gt;First, let us look at rejection sampling from the discrete point of view.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/d4gSH0/btqz4hUwaIh/3sKVcKlHsrGpvSb5PAHa3k/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/d4gSH0/btqz4hUwaIh/3sKVcKlHsrGpvSb5PAHa3k/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/d4gSH0/btqz4hUwaIh/3sKVcKlHsrGpvSb5PAHa3k/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fd4gSH0%2Fbtqz4hUwaIh%2F3sKVcKlHsrGpvSb5PAHa3k%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;It proceeds in much the same way as forward sampling.&lt;/p&gt;
&lt;p&gt;While estimating P(E=T|MC=T, A=F) above, the sample drawn for Alarm|B=F,E=T does not match the given A=F.&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;This Alarm|B=F,E=T sample therefore cannot be used, so we reject it. That is why the method is called rejection sampling.&lt;/span&gt;&lt;/p&gt;
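&lt;p&gt;The discrete procedure above can be sketched roughly as follows. The probability tables are an assumption: they are the classic burglary-alarm textbook values, not numbers read from the figure, and the names flip and rejection_sample are made up for this illustration. Each sample is generated in forward order and thrown away as soon as it contradicts the evidence MC=T, A=F.&lt;/p&gt;

```python
import random

# Hypothetical CPTs (classic burglary-alarm textbook values, not from the figure).
P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_MC = {True: 0.70, False: 0.01}


def flip(p, rng):
    """Return True with probability p."""
    return p > rng.random()


def rejection_sample(n, rng):
    """Estimate P(E=T | MC=T, A=F) from n forward-sampled draws."""
    kept = 0  # samples consistent with the evidence
    hits = 0  # kept samples in which E=T
    for _ in range(n):
        b = flip(P_B, rng)
        e = flip(P_E, rng)
        a = flip(P_A[(b, e)], rng)
        if a:  # evidence says A=F, so reject this sample
            continue
        mc = flip(P_MC[a], rng)
        if not mc:  # evidence says MC=T, so reject this sample
            continue
        kept += 1
        hits += e
    estimate = hits / kept if kept else float("nan")
    return estimate, kept


estimate, kept = rejection_sample(200_000, random.Random(0))
print(f"kept {kept} of 200000 samples, estimate = {estimate:.4f}")
```

&lt;p&gt;With these assumed tables only about 1% of the samples survive, which illustrates why many iterations are needed.&lt;/p&gt;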
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Now let us look at rejection sampling from the numerical point of view.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;In the figure below, p(x) is the probability distribution we want to sample from. Suppose we instead draw samples from a particular distribution that we know well.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;Say q(x) follows a Normal distribution and p(x) is a mixture distribution. Let us build a sampling distribution that envelopes p(x).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/djVRJ1/btqz4hUwGZb/9hWYZOfKfbOGj0i2V9d1JK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/djVRJ1/btqz4hUwGZb/9hWYZOfKfbOGj0i2V9d1JK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/djVRJ1/btqz4hUwGZb/9hWYZOfKfbOGj0i2V9d1JK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdjVRJ1%2Fbtqz4hUwGZb%2F9hWYZOfKfbOGj0i2V9d1JK%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;The p(x) we are interested in is constrained to sum to 1, so its height is limited.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;We can therefore multiply q(x) by a constant M so that the scaled distribution rises above this maximum height.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;Draw one value from the Normal distribution and call that point xi.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;At xi we now decide whether to accept this sample or not.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;We do not know the parameters of p(x), but we can evaluate it, so the height in the figure above (from xi up to A) is easy to obtain.&lt;/p&gt;
&lt;p&gt;Divide that height by the height of the Normal distribution multiplied by M, and use the result as the probability of accepting the sample; otherwise reject it. This is rejection sampling from the numerical view. Samples that fall in the Rejection Region above are rejected.&lt;/p&gt;
&lt;p&gt;If M*q(x) does not envelope p(x), rejection sampling does not work, so M has to be large enough. But as M grows, the rejected region grows too, so sampling takes longer.&lt;/p&gt;
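&lt;p&gt;The accept/reject rule described above can be sketched as follows. Everything concrete here is an assumption for illustration: the target p(x) is a made-up two-component mixture, the proposal q(x) is a Normal with standard deviation 3, and M = 3 was chosen so that M*q(x) stays above this particular p(x).&lt;/p&gt;

```python
import math
import random


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def p(x):
    """Target: hypothetical two-component mixture, only evaluated pointwise."""
    return 0.5 * normal_pdf(x, -2.0, 1.0) + 0.5 * normal_pdf(x, 2.0, 1.0)


def q(x):
    """Proposal: a single wide Normal that is easy to sample from."""
    return normal_pdf(x, 0.0, 3.0)


M = 3.0  # assumed large enough that M * q(x) envelopes p(x)


def rejection_sampling(n_proposals, rng):
    accepted = []
    for _ in range(n_proposals):
        x = rng.gauss(0.0, 3.0)          # x_i drawn from q
        accept_prob = p(x) / (M * q(x))  # height of p over height of M*q
        if accept_prob > rng.random():
            accepted.append(x)
    return accepted


samples = rejection_sampling(30_000, random.Random(0))
rate = len(samples) / 30_000
print(f"acceptance rate = {rate:.3f} (theory: 1/M = {1 / M:.3f})")
```

&lt;p&gt;Because both p and q are normalized here, the expected acceptance rate is 1/M, which shows directly why a larger M makes sampling slower.&lt;/p&gt;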
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/BQTui/btqz4h74stB/gLuA2ZTKwCre9hiuxZMCyK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/BQTui/btqz4h74stB/gLuA2ZTKwCre9hiuxZMCyK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/BQTui/btqz4h74stB/gLuA2ZTKwCre9hiuxZMCyK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FBQTui%2Fbtqz4h74stB%2FgLuA2ZTKwCre9hiuxZMCyK%2Fimg.png&quot; data-origin-width=&quot;0&quot; data-origin-height=&quot;0&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;The left plot above shows sampling from a mixture of Normals with M set to 1/3.&lt;/p&gt;
&lt;p&gt;The right plot above shows sampling from a single Normal with M set to 3.&lt;/p&gt;
&lt;p&gt;In the right plot the rightmost peak is small; this is a case where the envelope is poor (the region is under-sampled).&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Rejection sampling still suffers from slow computation.&lt;/p&gt;</description>
      <category>머신러닝/문일철 교수님 강의 정리 (인공지능및기계학습개론)</category>
      <category>RejectionSampling</category>
      <category>기각샘플링</category>
      <author>초코린</author>
      <guid isPermaLink="true">https://deadsquart.tistory.com/67</guid>
      <comments>https://deadsquart.tistory.com/67#entry67comment</comments>
      <pubDate>Wed, 27 Nov 2019 21:04:52 +0900</pubDate>
    </item>
    <item>
      <title>Week 10.1 Forward Sampling</title>
      <link>https://deadsquart.tistory.com/66</link>
      <description>&lt;p&gt;So far we have learned how to infer parameters with EM.&lt;/p&gt;
&lt;p&gt;This week we infer parameters with a different approach, sampling-based inference.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Let us go over the basic sampling methods.&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&amp;nbsp;&lt;/h4&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;1. Forward Sampling&lt;/b&gt;&lt;/h4&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/lhl1i/btqzLDpjqnc/ka4f6iu6nw8rDSnI7zx3V0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/lhl1i/btqzLDpjqnc/ka4f6iu6nw8rDSnI7zx3V0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/lhl1i/btqzLDpjqnc/ka4f6iu6nw8rDSnI7zx3V0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Flhl1i%2FbtqzLDpjqnc%2Fka4f6iu6nw8rDSnI7zx3V0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;Samples can be generated by following the topological order.&lt;/p&gt;
&lt;p&gt;Generate many samples and count the ones that match the probability of interest. What are the problems with this method?&lt;/p&gt;
&lt;p&gt;1) Because the samples are random, the estimate has error.&lt;/p&gt;
&lt;p&gt;2) The more trials we run, the longer it takes.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;So the method above is hard to use in practice.&lt;/p&gt;
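&lt;p&gt;As a rough sketch of forward sampling, the snippet below samples every variable after its parents and estimates a probability by counting. The probability tables are an assumption (the classic burglary-alarm textbook values, not numbers taken from the figure), and forward_sample and estimate are hypothetical helper names.&lt;/p&gt;

```python
import random

# Hypothetical CPTs (classic burglary-alarm textbook values, not from the figure).
P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_JC = {True: 0.90, False: 0.05}
P_MC = {True: 0.70, False: 0.01}


def forward_sample(rng):
    """Draw one full sample, parents before children (topological order)."""
    b = P_B > rng.random()
    e = P_E > rng.random()
    a = P_A[(b, e)] > rng.random()
    jc = P_JC[a] > rng.random()
    mc = P_MC[a] > rng.random()
    return {"B": b, "E": e, "A": a, "JC": jc, "MC": mc}


def estimate(event, n, rng):
    """Fraction of n samples for which event(sample) is true."""
    return sum(event(forward_sample(rng)) for _ in range(n)) / n


rng = random.Random(0)
p_mc = estimate(lambda s: s["MC"], 100_000, rng)
print(f"P(MC=T) is roughly {p_mc:.4f}")
```

&lt;p&gt;The estimate changes with the random seed and only tightens as the sample count grows, which matches the two problems listed above.&lt;/p&gt;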
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/ck0LpP/btqzI3XsNgo/D7phkDA0jLume9cdxBVIP0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/ck0LpP/btqzI3XsNgo/D7phkDA0jLume9cdxBVIP0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/ck0LpP/btqzI3XsNgo/D7phkDA0jLume9cdxBVIP0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fck0LpP%2FbtqzI3XsNgo%2FD7phkDA0jLume9cdxBVIP0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Sampling-based inference is used to find the estimated value (that is, the parameter).&lt;/p&gt;
&lt;p&gt;A histogram is drawn from a large number of samples.&lt;/p&gt;</description>
      <category>머신러닝/문일철 교수님 강의 정리 (인공지능및기계학습개론)</category>
      <category>forwardsampling</category>
      <category>samplingbased</category>
      <author>초코린</author>
      <guid isPermaLink="true">https://deadsquart.tistory.com/66</guid>
      <comments>https://deadsquart.tistory.com/66#entry66comment</comments>
      <pubDate>Wed, 13 Nov 2019 20:54:53 +0900</pubDate>
    </item>
    <item>
      <title>Week 9.5 Baum-Welch Algorithm</title>
      <link>https://deadsquart.tistory.com/65</link>
      <description>&lt;p&gt;This lecture covers the learning question.&lt;/p&gt;
&lt;p&gt;Only X is given; &amp;pi;, a, and b are unknown, and there is no labeled dataset for training. We just have a large collection of X. What should we do in this case? We approach it as a clustering problem.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;The decoding question was close to supervised learning: the parameters (&lt;span style=&quot;color: #333333;&quot;&gt;&amp;pi;, a, b)&lt;/span&gt; are all known from the training cases.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&lt;b&gt;BAUM - WELCH Algorithm&lt;/b&gt;&lt;/h4&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/H1i05/btqzwYCnmaj/MxpO8xBWuUR9QnQY4oOG31/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/H1i05/btqzwYCnmaj/MxpO8xBWuUR9QnQY4oOG31/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/H1i05/btqzwYCnmaj/MxpO8xBWuUR9QnQY4oOG31/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FH1i05%2FbtqzwYCnmaj%2FMxpO8xBWuUR9QnQY4oOG31%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;The most common situation in practice is that only X is known. So&lt;/p&gt;
&lt;p&gt;1) we have to find the estimates &lt;span style=&quot;color: #333333;&quot;&gt;&amp;pi;_hat, a_hat, b_hat from X, and&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;2) we find the most probable z.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;To do this we need the EM algorithm.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/NcsX8/btqzxKRgcAk/8oHKJYkIWPS36MpVMCRVu1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/NcsX8/btqzxKRgcAk/8oHKJYkIWPS36MpVMCRVu1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/NcsX8/btqzxKRgcAk/8oHKJYkIWPS36MpVMCRVu1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FNcsX8%2FbtqzxKRgcAk%2F8oHKJYkIWPS36MpVMCRVu1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;The EM algorithm finds the maximum likelihood solution when latent variables are present.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;In supervised learning the optimization can be done without z, but in unsupervised learning z is needed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Because the expression has a &amp;Sigma; (summation) inside the log, we use the EM algorithm.&lt;/p&gt;
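&lt;p&gt;For reference, the &quot;summation inside the log&quot; can be written out as below. The notation is an assumption and may differ from the lecture slides: theta stands for (&amp;pi;, a, b), and Q is the expected complete-data log-likelihood that EM maximizes instead.&lt;/p&gt;

```latex
\ln P(X \mid \theta) = \ln \sum_{Z} P(X, Z \mid \theta),
\qquad \theta = (\pi, a, b)

Q(\theta, \theta^{\text{old}})
  = \sum_{Z} P(Z \mid X, \theta^{\text{old}}) \, \ln P(X, Z \mid \theta)
```

&lt;p&gt;In Q the logarithm acts directly on the joint P(X, Z | theta), which factorizes into &amp;pi;, a, and b terms, so each parameter can be optimized separately.&lt;/p&gt;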
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cFfuvY/btqzzJpLTby/zofsbWKxaU1bBv9pjEteG1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cFfuvY/btqzzJpLTby/zofsbWKxaU1bBv9pjEteG1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cFfuvY/btqzzJpLTby/zofsbWKxaU1bBv9pjEteG1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcFfuvY%2FbtqzzJpLTby%2FzofsbWKxaU1bBv9pjEteG1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/djWYUF/btqzwZ2lGcf/8RqTt9OTWRX74d936PPwkK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/djWYUF/btqzwZ2lGcf/8RqTt9OTWRX74d936PPwkK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/djWYUF/btqzwZ2lGcf/8RqTt9OTWRX74d936PPwkK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdjWYUF%2FbtqzwZ2lGcf%2F8RqTt9OTWRX74d936PPwkK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;&amp;pi; follows a&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;color: #333333;&quot;&gt;multinomial distribution, so it is constrained to sum to 1. We therefore use the&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;color: #333333;&quot;&gt;Lagrange method.&lt;/span&gt;&lt;/p&gt;
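&lt;p&gt;A sketch of that Lagrange step for &amp;pi;, under assumed notation: gamma(z_1^k) denotes P(z_1^k = 1 | X, theta_old), K is the number of states, and only the part of the EM objective that depends on &amp;pi; is kept. The symbols in the figures that follow may differ.&lt;/p&gt;

```latex
\mathcal{L}(\pi, \lambda)
  = \sum_{k=1}^{K} \gamma(z_1^k) \ln \pi_k
    + \lambda \Bigl( 1 - \sum_{k=1}^{K} \pi_k \Bigr)

\frac{\partial \mathcal{L}}{\partial \pi_k}
  = \frac{\gamma(z_1^k)}{\pi_k} - \lambda = 0
  \;\Longrightarrow\;
  \pi_k = \frac{\gamma(z_1^k)}{\lambda}

\sum_{k=1}^{K} \pi_k = 1
  \;\Longrightarrow\;
  \lambda = \sum_{j=1}^{K} \gamma(z_1^j),
  \qquad
  \pi_k = \frac{\gamma(z_1^k)}{\sum_{j=1}^{K} \gamma(z_1^j)}
```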
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/HtXlD/btqzy7kkPin/KusXG9YaSBSDkIkVwO39G0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/HtXlD/btqzy7kkPin/KusXG9YaSBSDkIkVwO39G0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/HtXlD/btqzy7kkPin/KusXG9YaSBSDkIkVwO39G0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FHtXlD%2Fbtqzy7kkPin%2FKusXG9YaSBSDkIkVwO39G0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/yNI2q/btqzx6NfpPQ/r7GrVyHOrdAvjx77Ojwp20/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/yNI2q/btqzx6NfpPQ/r7GrVyHOrdAvjx77Ojwp20/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/yNI2q/btqzx6NfpPQ/r7GrVyHOrdAvjx77Ojwp20/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FyNI2q%2Fbtqzx6NfpPQ%2Fr7GrVyHOrdAvjx77Ojwp20%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>머신러닝/문일철 교수님 강의 정리 (인공지능및기계학습개론)</category>
      <author>초코린</author>
      <guid isPermaLink="true">https://deadsquart.tistory.com/65</guid>
      <comments>https://deadsquart.tistory.com/65#entry65comment</comments>
      <pubDate>Tue, 5 Nov 2019 22:50:37 +0900</pubDate>
    </item>
    <item>
      <title>Week 9.4 Viterbi Decoding Algorithm</title>
      <link>https://deadsquart.tistory.com/64</link>
      <description>&lt;p&gt;Among decoding methods, the Viterbi decoding algorithm is widely used.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bJpFNI/btqzvonkKPG/LUVbyRMy4QudpY1uYMNUcK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bJpFNI/btqzvonkKPG/LUVbyRMy4QudpY1uYMNUcK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bJpFNI/btqzvonkKPG/LUVbyRMy4QudpY1uYMNUcK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbJpFNI%2FbtqzvonkKPG%2FLUVbyRMy4QudpY1uYMNUcK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;The forward probability and the backward probability&lt;/span&gt; are multiplied together to obtain the joint probability that the latent factor at a particular time t belongs to a particular cluster k.&lt;/p&gt;
&lt;p&gt;This structure is recursive. It gives the joint for the latent variable at a particular time t, which means the conditional probability is available as well: given x, we can make the most probable assignment for the latent variable at a particular time t.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/qcXwr/btqzx7EenyT/Yo5NG83ESog6CMPhrl5VPK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/qcXwr/btqzx7EenyT/Yo5NG83ESog6CMPhrl5VPK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/qcXwr/btqzx7EenyT/Yo5NG83ESog6CMPhrl5VPK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FqcXwr%2Fbtqzx7EenyT%2FYo5NG83ESog6CMPhrl5VPK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;The problem is that this is an assignment for a single latent variable. &lt;span style=&quot;color: #333333;&quot;&gt;We want to look at the whole sequence and assign the latent variables for the whole sequence. In the figure above, knowing the observations X (red box) lets us find the yellow part in item 2, but what we want is the whole sequence Z in the purple box. This is called the decoding question.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/X5Jy3/btqzyhNn4zR/tw47WlMKkLkLgeI12f2pKk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/X5Jy3/btqzyhNn4zR/tw47WlMKkLkLgeI12f2pKk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/X5Jy3/btqzyhNn4zR/tw47WlMKkLkLgeI12f2pKk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FX5Jy3%2FbtqzyhNn4zR%2Ftw47WlMKkLkLgeI12f2pKk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #333333;&quot;&gt;K* : the most probable assignment of a particular cluster K, that is, the assignment that maximizes the probability when the whole sequence X is given.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Vt k : the probability of the whole sequence up to time t-1 (x1,...,xt-1 and z1,...,zt-1), together with the cluster of the latent factor being estimated now (zkt=1) and the current observation (xt), maximized by varying z1 through zt-1.&lt;/p&gt;
&lt;p&gt;The Viterbi algorithm also has a repeating structure, so dynamic programming can be applied to obtain the most probable assignment for the whole sequence.&lt;/p&gt;
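&lt;p&gt;Written out, the repeating structure looks roughly like this. The index convention is an assumption: a_{i,k} is the transition probability from state i to state k, b_{k,x} is the probability of emitting x in state k, and trace is a hypothetical name for the stored best predecessor.&lt;/p&gt;

```latex
V_1^k = \pi_k \, b_{k, x_1}

V_t^k = \max_{z_1, \dots, z_{t-1}}
        P(x_1, \dots, x_{t-1}, z_1, \dots, z_{t-1}, x_t, z_t^k = 1)
      = b_{k, x_t} \, \max_{i} \, a_{i,k} \, V_{t-1}^{i}

\mathrm{trace}_t^k = \arg\max_{i} \, a_{i,k} \, V_{t-1}^{i}
```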
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Let us look at an example of dynamic programming.&lt;/p&gt;
&lt;p&gt;Below is the assembly line of a car factory.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/z5d3g/btqzwv0nPqE/FneQkkagbaTqSJZGxCcWK0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/z5d3g/btqzwv0nPqE/FneQkkagbaTqSJZGxCcWK0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/z5d3g/btqzwv0nPqE/FneQkkagbaTqSJZGxCcWK0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fz5d3g%2Fbtqzwv0nPqE%2FFneQkkagbaTqSJZGxCcWK0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;The number in each circle is the time spent at that station. The arrows show the paths from the current station to the next one.&lt;/p&gt;
&lt;p&gt;Dynamic programming is used to compute which stations to assemble at so that the total time is smallest.&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #ee2323;&quot;&gt;&lt;b&gt;This has a form similar to finding the most probable assignment we are after.&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
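&lt;p&gt;A minimal sketch of the assembly-line computation, assuming two lines with entry, station, transfer, and exit times. The numbers are a well-known textbook instance used only for illustration, not values read from the figure, and fastest_way is a hypothetical function name.&lt;/p&gt;

```python
def fastest_way(entry, station, transfer, exit_):
    """Return (minimum total time, line used at each station)."""
    n = len(station[0])
    cost = [[0] * n for _ in range(2)]  # cost[line][j]: best time through station j
    prev = [[0] * n for _ in range(2)]  # prev[line][j]: line used at station j-1
    for line in range(2):
        cost[line][0] = entry[line] + station[line][0]
    for j in range(1, n):
        for line in range(2):
            other = 1 - line
            stay = cost[line][j - 1]
            switch = cost[other][j - 1] + transfer[other][j - 1]
            best, src = min((stay, line), (switch, other))
            cost[line][j] = best + station[line][j]
            prev[line][j] = src
    total, line = min((cost[l][n - 1] + exit_[l], l) for l in range(2))
    path = [line]
    for j in range(n - 1, 0, -1):  # follow the stored trace backwards
        line = prev[line][j]
        path.append(line)
    path.reverse()
    return total, path


# Hypothetical instance (times in arbitrary units).
entry = [2, 4]
station = [[7, 9, 3, 4, 8, 4], [8, 5, 6, 4, 5, 7]]
transfer = [[2, 3, 1, 3, 4], [2, 1, 2, 2, 1]]
exit_ = [3, 2]
total, path = fastest_way(entry, station, transfer, exit_)
print(total, path)
```

&lt;p&gt;Replacing min with max and sums of times with products of probabilities gives the same table-plus-trace structure that Viterbi uses.&lt;/p&gt;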
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/8Y4Lw/btqzxDDEI8u/jPERNHuX8L1qrByxKgK76k/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/8Y4Lw/btqzxDDEI8u/jPERNHuX8L1qrByxKgK76k/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/8Y4Lw/btqzxDDEI8u/jPERNHuX8L1qrByxKgK76k/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F8Y4Lw%2FbtqzxDDEI8u%2FjPERNHuX8L1qrByxKgK76k%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;Let us look at the example above once more from the probability point of view.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dYmwOy/btqzxMG6tOc/Z6XFwWvmY6YqCUoRk4kRR1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dYmwOy/btqzxMG6tOc/Z6XFwWvmY6YqCUoRk4kRR1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dYmwOy/btqzxMG6tOc/Z6XFwWvmY6YqCUoRk4kRR1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdYmwOy%2FbtqzxMG6tOc%2FZ6XFwWvmY6YqCUoRk4kRR1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;Vt k takes the form of a [Time] x [state] table.&lt;/p&gt;
&lt;p&gt;The trace is stored, and the probability up to a particular time t is computed and stored.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
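&lt;p&gt;What problems does the Viterbi algorithm have?&lt;/p&gt;
&lt;p&gt;As the number of time steps grows, more probabilities are multiplied together =&amp;gt; the product becomes so small that it is treated as 0 (the underflow problem) =&amp;gt; so the computation is done in the log domain.&lt;/p&gt;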
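&lt;p&gt;A log-domain version of the recursion might look like the sketch below. It assumes every probability is strictly positive (so that the log is defined), uses list-of-lists parameters pi, a, b with the hypothetical name viterbi_log, and runs on a small made-up 2-state HMM.&lt;/p&gt;

```python
import math


def viterbi_log(obs, pi, a, b):
    """Most probable state path and its log probability (sums of logs, no products)."""
    K = len(pi)
    log = math.log
    v = [log(pi[k]) + log(b[k][obs[0]]) for k in range(K)]
    back = []  # back[t-1][k]: best previous state when state k is used at time t
    for x in obs[1:]:
        step, new_v = [], []
        for k in range(K):
            score, arg = max((v[i] + log(a[i][k]), i) for i in range(K))
            new_v.append(score + log(b[k][x]))
            step.append(arg)
        back.append(step)
        v = new_v
    score, last = max((v[k], k) for k in range(K))
    path = [last]
    for step in reversed(back):  # follow the trace from the end to the start
        path.append(step[path[-1]])
    path.reverse()
    return path, score


# Hypothetical 2-state, 3-symbol HMM used only for illustration.
pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
path, log_prob = viterbi_log([0, 1, 2], pi, a, b)
print(path, math.exp(log_prob))
```

&lt;p&gt;In double precision a product such as 0.1 ** 400 already evaluates to 0.0, while the corresponding sum of logs is an ordinary negative number, so long sequences remain comparable.&lt;/p&gt;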
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;With the Viterbi algorithm, once &amp;pi;, a, and b have been trained with X given, we can make the most probable assignment for a newly arriving sequence.&lt;/p&gt;</description>
      <category>머신러닝/문일철 교수님 강의 정리 (인공지능및기계학습개론)</category>
      <author>초코린</author>
      <guid isPermaLink="true">https://deadsquart.tistory.com/64</guid>
      <comments>https://deadsquart.tistory.com/64#entry64comment</comments>
      <pubDate>Mon, 4 Nov 2019 21:22:38 +0900</pubDate>
    </item>
    <item>
      <title>Week 9.3 Forward-Backward probability Calculation</title>
      <link>https://deadsquart.tistory.com/63</link>
      <description>&lt;h4 data-ke-size=&quot;size20&quot;&gt;Dynamic Programming&lt;/h4&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/QybYk/btqy66UPXFi/JlmY0oH6xuJkBihHHyksZ0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/QybYk/btqy66UPXFi/JlmY0oH6xuJkBihHHyksZ0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/QybYk/btqy66UPXFi/JlmY0oH6xuJkBihHHyksZ0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FQybYk%2Fbtqy66UPXFi%2FJlmY0oH6xuJkBihHHyksZ0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/WK758/btqy7Jrkd8K/yB3l6rOTjY4YnXL52SHGk0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/WK758/btqy7Jrkd8K/yB3l6rOTjY4YnXL52SHGk0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/WK758/btqy7Jrkd8K/yB3l6rOTjY4YnXL52SHGk0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FWK758%2Fbtqy7Jrkd8K%2FyB3l6rOTjY4YnXL52SHGk0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;Recursion: top-down approach.&lt;/p&gt;
&lt;p&gt;Dynamic programming: bottom-up approach. The table that stores previously computed values is called the memoization table.&lt;/p&gt;
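&lt;p&gt;The contrast between the two approaches can be illustrated with Fibonacci numbers. Fibonacci is an assumed example here and may not be the one in the figures; fib_recursive and fib_bottom_up are hypothetical names.&lt;/p&gt;

```python
def fib_recursive(n):
    """Top-down recursion: the same subproblems are recomputed many times."""
    if n in (0, 1):
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_bottom_up(n):
    """Dynamic programming: fill the table from the smallest case upward."""
    table = [0, 1] + [0] * max(0, n - 1)
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]  # reuse stored values
    return table[n]


print(fib_recursive(20), fib_bottom_up(20))
```

&lt;p&gt;The bottom-up version needs a number of additions proportional to n, whereas the plain recursion grows exponentially because it never remembers earlier results.&lt;/p&gt;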
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dDvwo2/btqy6TnQxHc/AyaY01nGg2IG7kle4Ma4t1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dDvwo2/btqy6TnQxHc/AyaY01nGg2IG7kle4Ma4t1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dDvwo2/btqy6TnQxHc/AyaY01nGg2IG7kle4Ma4t1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdDvwo2%2Fbtqy6TnQxHc%2FAyaY01nGg2IG7kle4Ma4t1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;One may ask: when determining z2, do we not use x3? That is why the backward probability is needed.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/Y1coX/btqy8h9aCVv/cKb7A9WhljI1W7VP3Lelvk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/Y1coX/btqy8h9aCVv/cKb7A9WhljI1W7VP3Lelvk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/Y1coX/btqy8h9aCVv/cKb7A9WhljI1W7VP3Lelvk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FY1coX%2Fbtqy8h9aCVv%2FcKb7A9WhljI1W7VP3Lelvk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;The backward probability starts from the question of how, considering the whole X sequence, the latent factor at a particular time point is assigned probabilistically.&lt;/p&gt;
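&lt;p&gt;The forward and backward probabilities can be combined as in the following sketch. It assumes list-of-lists parameters pi, a (transitions), and b (emissions) on a small made-up HMM; forward_backward, alpha, beta, and gamma are names chosen for this illustration.&lt;/p&gt;

```python
def forward_backward(obs, pi, a, b):
    """Return (gamma, p_x), where gamma[t][k] = P(z_t = k | x_1..x_T)."""
    T, K = len(obs), len(pi)
    # Forward pass: alpha[t][k] = P(x_1..x_t, z_t = k)
    alpha = [[pi[k] * b[k][obs[0]] for k in range(K)]]
    for t in range(1, T):
        prev = alpha[-1]
        alpha.append([b[k][obs[t]] * sum(prev[i] * a[i][k] for i in range(K))
                      for k in range(K)])
    # Backward pass: beta[t][k] = P(x_{t+1}..x_T | z_t = k)
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(a[k][j] * b[j][obs[t + 1]] * beta[t + 1][j]
                       for j in range(K)) for k in range(K)]
    p_x = sum(alpha[-1])  # likelihood of the whole observation sequence
    gamma = [[alpha[t][k] * beta[t][k] / p_x for k in range(K)]
             for t in range(T)]
    return gamma, p_x


# Hypothetical 2-state, 3-symbol HMM used only for illustration.
pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
gamma, p_x = forward_backward([0, 1, 2], pi, a, b)
print(p_x, gamma[1])
```

&lt;p&gt;Here gamma[1] is the distribution of z2 given all three observations, so x3 enters through the backward term.&lt;/p&gt;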
&lt;p&gt;If z2 is known, x3 does not depend on anything earlier (x1, x2).&lt;/p&gt;</description>
      <category>머신러닝/문일철 교수님 강의 정리 (인공지능및기계학습개론)</category>
      <category>backwardalgorithm</category>
      <category>forwardalgorithm</category>
      <category>forwardbackward</category>
      <author>초코린</author>
      <guid isPermaLink="true">https://deadsquart.tistory.com/63</guid>
      <comments>https://deadsquart.tistory.com/63#entry63comment</comments>
      <pubDate>Thu, 17 Oct 2019 22:30:09 +0900</pubDate>
    </item>
  </channel>
</rss>