Inductive logic programming

Inductive logic programming (ILP) is a subfield of machine learning that uses logic programming as a uniform representation for examples, background knowledge, and hypotheses. Given an encoding of the background knowledge and a set of examples represented as a logical database of facts, an ILP system derives a hypothesised logic program that entails all the positive examples and none of the negative examples.

Schema: positive examples + negative examples + background knowledge ⇒ hypothesis.

Inductive logic programming is particularly useful in bioinformatics and natural language processing. Gordon Plotkin and Ehud Shapiro laid the initial theoretical foundation for inductive machine learning in a logical setting.^[1]^[2]^[3] Shapiro built the first implementation in 1981:^[4] a Prolog program that inductively inferred logic programs from positive and negative examples. The term inductive logic programming was first introduced^[5] in a paper by Stephen Muggleton in 1991.^[6] Muggleton also founded the international conference on inductive logic programming and introduced the theoretical ideas of predicate invention, inverse resolution,^[7] and inverse entailment.^[8] Muggleton first implemented inverse entailment in the PROGOL system. The term "inductive" here refers to philosophical induction (that is, suggesting a theory to explain observed facts) rather than mathematical induction (that is, proving a property for all members of a well-ordered set).

Formal definition

The background knowledge is given as a logic theory B, commonly in the form of Horn clauses as used in logic programming. The positive and negative examples are given as conjunctions $E^{+}$ and $E^{-}$ of unnegated and negated ground literals, respectively. A correct hypothesis h is a logic proposition that satisfies the following requirements.^[9]

${\begin{array}{llll}{\text{Necessity:}}&B&\not \models &E^{+}\\{\text{Sufficiency:}}&B\land h&\models &E^{+}\\{\text{Weak consistency:}}&B\land h&\not \models &{\textit {false}}\\{\text{Strong consistency:}}&B\land h\land E^{-}&\not \models &{\textit {false}}\end{array}}$

"Necessity" does not impose a restriction on h, but forbids generating any hypothesis as long as the positive facts are explainable without it. "Sufficiency" requires any generated hypothesis h to explain all positive examples $E^{+}$. "Weak consistency" forbids generating any hypothesis h that contradicts the background knowledge B. "Strong consistency" additionally forbids generating any hypothesis h that is inconsistent with the negative examples $E^{-}$, given the background knowledge B; it implies "Weak consistency", and if no negative examples are given, both requirements coincide. Džeroski^[10] requires only "Sufficiency" (called "Completeness" there) and "Strong consistency".

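The statement that strong consistency implies weak consistency rests on the monotonicity of entailment. A short derivation, given here as a sketch in the notation of the requirements above:

```latex
\begin{aligned}
B\land h\models {\textit{false}}
  &\;\Longrightarrow\; B\land h\land E^{-}\models {\textit{false}}
  &&\text{(adding a conjunct cannot restore satisfiability)}\\
B\land h\land E^{-}\not\models {\textit{false}}
  &\;\Longrightarrow\; B\land h\not\models {\textit{false}}
  &&\text{(contrapositive of the line above)}\\
E^{-}={\textit{true}}
  &\;\Longrightarrow\; B\land h\land E^{-}\equiv B\land h
  &&\text{(so both requirements coincide)}
\end{aligned}
```
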
Example

Figure: family relations assumed in the section "Example".

The following well-known example about learning definitions of family relations uses the abbreviations ${\textit {par}}:{\textit {parent}}$, ${\textit {fem}}:{\textit {female}}$, ${\textit {dau}}:{\textit {daughter}}$, $g:{\textit {George}}$, $h:{\textit {Helen}}$, $m:{\textit {Mary}}$, $t:{\textit {Tom}}$, $n:{\textit {Nancy}}$, and $e:{\textit {Eve}}$.

It starts from the background knowledge (cf. figure)
${\textit {par}}(h,m)\land {\textit {par}}(h,t)\land {\textit {par}}(g,m)\land {\textit {par}}(t,e)\land {\textit {par}}(n,e)\land {\textit {fem}}(h)\land {\textit {fem}}(m)\land {\textit {fem}}(n)\land {\textit {fem}}(e)$,
the positive examples
${\textit {dau}}(m,h)\land {\textit {dau}}(e,t)$,
and the trivial proposition ${\textit {true}}$ to denote the absence of negative examples.

Plotkin's^[11]^[12] "relative least general generalization (rlgg)" approach to inductive logic programming is used to obtain a suggestion for how to formally define the daughter relation ${\textit {dau}}$.

This approach uses the following steps.

Relativize each positive example literal with the complete background knowledge:
${\begin{aligned}{\textit {dau}}(m,h)\leftarrow {\textit {par}}(h,m)\land {\textit {par}}(h,t)\land {\textit {par}}(g,m)\land {\textit {par}}(t,e)\land {\textit {par}}(n,e)\land {\textit {fem}}(h)\land {\textit {fem}}(m)\land {\textit {fem}}(n)\land {\textit {fem}}(e)\\{\textit {dau}}(e,t)\leftarrow {\textit {par}}(h,m)\land {\textit {par}}(h,t)\land {\textit {par}}(g,m)\land {\textit {par}}(t,e)\land {\textit {par}}(n,e)\land {\textit {fem}}(h)\land {\textit {fem}}(m)\land {\textit {fem}}(n)\land {\textit {fem}}(e)\end{aligned}}$ ,
Convert into clause normal form:
${\begin{aligned}{\textit {dau}}(m,h)\lor \lnot {\textit {par}}(h,m)\lor \lnot {\textit {par}}(h,t)\lor \lnot {\textit {par}}(g,m)\lor \lnot {\textit {par}}(t,e)\lor \lnot {\textit {par}}(n,e)\lor \lnot {\textit {fem}}(h)\lor \lnot {\textit {fem}}(m)\lor \lnot {\textit {fem}}(n)\lor \lnot {\textit {fem}}(e)\\{\textit {dau}}(e,t)\lor \lnot {\textit {par}}(h,m)\lor \lnot {\textit {par}}(h,t)\lor \lnot {\textit {par}}(g,m)\lor \lnot {\textit {par}}(t,e)\lor \lnot {\textit {par}}(n,e)\lor \lnot {\textit {fem}}(h)\lor \lnot {\textit {fem}}(m)\lor \lnot {\textit {fem}}(n)\lor \lnot {\textit {fem}}(e)\end{aligned}}$ ,
Anti-unify each compatible^[13] pair^[14] of literals:
- ${\textit {dau}}(x_{me},x_{ht})$ from ${\textit {dau}}(m,h)$ and ${\textit {dau}}(e,t)$,
- $\lnot {\textit {par}}(x_{ht},x_{me})$ from $\lnot {\textit {par}}(h,m)$ and $\lnot {\textit {par}}(t,e)$,
- $\lnot {\textit {fem}}(x_{me})$ from $\lnot {\textit {fem}}(m)$ and $\lnot {\textit {fem}}(e)$,
- $\lnot {\textit {par}}(g,m)$ from $\lnot {\textit {par}}(g,m)$ and $\lnot {\textit {par}}(g,m)$, and similarly for every other background-knowledge literal,
- $\lnot {\textit {par}}(x_{gt},x_{me})$ from $\lnot {\textit {par}}(g,m)$ and $\lnot {\textit {par}}(t,e)$, and many more negated literals.
Delete all negated literals containing variables that do not occur in a positive literal:
- After deleting all negated literals containing variables other than $x_{me},x_{ht}$, only ${\textit {dau}}(x_{me},x_{ht})\lor \lnot {\textit {par}}(x_{ht},x_{me})\lor \lnot {\textit {fem}}(x_{me})$ remains, together with all ground literals from the background knowledge.
Convert clauses back to Horn form:
- ${\textit {dau}}(x_{me},x_{ht})\leftarrow {\textit {par}}(x_{ht},x_{me})\land {\textit {fem}}(x_{me})\land ({\text{all background knowledge facts}})$

The resulting Horn clause is the hypothesis h obtained by the rlgg approach. Ignoring the background knowledge facts, the clause informally reads "$x_{me}$ is called a daughter of $x_{ht}$ if $x_{ht}$ is the parent of $x_{me}$ and $x_{me}$ is female", which is a commonly accepted definition.
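
The anti-unification and deletion steps can be sketched in a few lines of Python. This is a minimal illustration for exactly two positive examples and function-free ground facts, not the algorithm of any particular system. The tuple encoding of literals, the function names `anti_unify`, `variables` and `rlgg`, and the variable naming scheme `x_ab` are assumptions made for this sketch; the naming scheme is only unambiguous for single-letter constants.

```python
def anti_unify(lit1, lit2):
    """Least general generalization of two ground literals, or None if incompatible.

    Literals are tuples such as ("par", "h", "m"). Each pair of differing
    arguments (a, b) becomes the variable "x_ab", so equal pairs share a variable.
    """
    if lit1[0] != lit2[0] or len(lit1) != len(lit2):
        return None
    args = tuple(a if a == b else f"x_{a}{b}" for a, b in zip(lit1[1:], lit2[1:]))
    return (lit1[0],) + args


def variables(literal):
    """Variables of a literal; by the convention above they start with 'x_'."""
    return {term for term in literal[1:] if term.startswith("x_")}


def rlgg(example1, example2, background):
    """Return (head, body) of the generalized clause for two positive examples."""
    head = anti_unify(example1, example2)
    if head is None:
        raise ValueError("examples must share predicate symbol and arity")
    # Anti-unification step: every compatible pair of (negated) body literals.
    body = {anti_unify(l1, l2) for l1 in background for l2 in background}
    body.discard(None)
    # Deletion step: drop body literals with a variable that is not in the head.
    body = {lit for lit in body if variables(lit) <= variables(head)}
    return head, body


background = {
    ("par", "h", "m"), ("par", "h", "t"), ("par", "g", "m"),
    ("par", "t", "e"), ("par", "n", "e"),
    ("fem", "h"), ("fem", "m"), ("fem", "n"), ("fem", "e"),
}
head, body = rlgg(("dau", "m", "h"), ("dau", "e", "t"), background)
non_ground = {lit for lit in body if variables(lit)}
print(head)        # the generalized head literal
print(non_ground)  # body literals that still contain variables
```

On the family data the head is `dau(x_me, x_ht)` and the only surviving non-ground body literals are `par(x_ht, x_me)` and `fem(x_me)`; the remaining body literals are the ground background facts themselves.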

Concerning the above requirements, "Necessity" is satisfied because the predicate ${\textit {dau}}$ does not appear in the background knowledge, which therefore cannot imply any property containing this predicate, such as the positive examples. "Sufficiency" is satisfied by the computed hypothesis $h$, since it, together with ${\textit {par}}(h,m)\land {\textit {fem}}(m)$ from the background knowledge, implies the first positive example ${\textit {dau}}(m,h)$, and similarly $h$ and ${\textit {par}}(t,e)\land {\textit {fem}}(e)$ from the background knowledge imply the second positive example ${\textit {dau}}(e,t)$. "Weak consistency" is satisfied by $h$, since $h$ holds in the (finite) Herbrand structure described by the background knowledge; the same argument applies to "Strong consistency".
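
These checks can be reproduced mechanically. The sketch below assumes a function-free, non-recursive hypothesis over ground background facts, so a single rule application already yields every consequence; the helper names `match` and `apply_rule`, the tuple encoding, and the `x_` variable convention are assumptions of this illustration. Because the program is definite (no negation), weak consistency holds trivially and is not tested separately.

```python
def is_var(term):
    """Assumed convention: variable names start with 'x_'."""
    return term.startswith("x_")


def match(pattern, fact, subst):
    """Extend subst so that pattern equals the ground fact; None if impossible."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    subst = dict(subst)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if subst.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return subst


def apply_rule(rule, facts):
    """All ground head instances derivable by one application of the rule."""
    head, body = rule
    substs = [{}]
    for literal in body:
        extended = []
        for s in substs:
            for fact in facts:
                s2 = match(literal, fact, s)
                if s2 is not None:
                    extended.append(s2)
        substs = extended
    return {(head[0],) + tuple(s.get(t, t) for t in head[1:]) for s in substs}


B = {
    ("par", "h", "m"), ("par", "h", "t"), ("par", "g", "m"),
    ("par", "t", "e"), ("par", "n", "e"),
    ("fem", "h"), ("fem", "m"), ("fem", "n"), ("fem", "e"),
}
h = (("dau", "x_me", "x_ht"), [("par", "x_ht", "x_me"), ("fem", "x_me")])
positives = {("dau", "m", "h"), ("dau", "e", "t")}
negatives = set()  # the example gives no negative examples

model = B | apply_rule(h, B)
necessity = not (positives <= B)                 # B alone does not entail E+
sufficiency = positives <= model                 # B and h entail E+
strong_consistency = negatives.isdisjoint(model) # no negative example is derived
print(necessity, sufficiency, strong_consistency)
```

Besides the two positive examples, the hypothesis also derives `dau(m, g)` and `dau(e, n)`, which is consistent with the intended reading of the daughter relation.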

The common definition of the grandmother relation, ${\textit {gra}}(x,z)\leftarrow {\textit {fem}}(x)\land {\textit {par}}(x,y)\land {\textit {par}}(y,z)$, cannot be learnt with the above approach, since the variable y occurs only in the clause body; the corresponding literals would have been deleted in the fourth step of the approach. To overcome this flaw, that step has to be modified so that it can be parametrized with different literal post-selection heuristics. Historically, the GOLEM implementation is based on the rlgg approach.
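
The effect of the deletion step on the grandmother clause can be seen in a small sketch. The first function implements the rule as stated in the steps above; the second, `linked_to_head`, is a hypothetical alternative heuristic introduced only for illustration (it keeps literals connected to the head through a chain of shared variables) and is not claimed to be what GOLEM or any other system uses. The variable names and the tuple encoding are likewise assumptions.

```python
VARIABLES = {"x", "y", "z"}  # assumed variable names for this illustration


def variables(literal):
    return {t for t in literal[1:] if t in VARIABLES}


def head_variables_only(head, body):
    """Deletion step as described above: body variables must occur in the head."""
    return [lit for lit in body if variables(lit) <= variables(head)]


def linked_to_head(head, body):
    """Hypothetical alternative: keep literals reachable from the head
    through a chain of shared variables (ground literals are kept too)."""
    reached, kept, pending = variables(head), [], list(body)
    progress = True
    while progress:
        progress = False
        for lit in list(pending):
            if not variables(lit) or variables(lit) & reached:
                kept.append(lit)
                pending.remove(lit)
                reached |= variables(lit)
                progress = True
    return kept


head = ("gra", "x", "z")
body = [("fem", "x"), ("par", "x", "y"), ("par", "y", "z")]
print(head_variables_only(head, body))  # loses both par literals
print(linked_to_head(head, body))       # keeps the chain through y
```

With the original rule only `fem(x)` survives, which would yield the overly general clause "every female is a grandmother of everyone". A more permissive heuristic such as the one sketched keeps the intended body, at the price of retaining many more literals on realistic data.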

Inductive logic programming system

An inductive logic programming system is a program that takes as input logic theories $B,E^{+},E^{-}$ and outputs a correct hypothesis $H$ with respect to these theories. The algorithm of an ILP system consists of two parts: hypothesis search and hypothesis selection. First, hypotheses are searched for with an inductive logic programming procedure; then a subset of the hypotheses found (in most systems a single hypothesis) is chosen by a selection algorithm. A selection algorithm scores each of the hypotheses found and returns those with the highest score. One example of a score function is minimal compression length, where the hypothesis with the lowest Kolmogorov complexity has the highest score and is returned. An ILP system is complete if and only if, for any input logic theories $B,E^{+},E^{-}$, any correct hypothesis $H$ with respect to these theories can be found with its hypothesis search procedure.
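
A selection step of this kind can be sketched as follows. Kolmogorov complexity is not computable, so this illustration substitutes a crude stand-in, the total number of literals in a hypothesis; that proxy, the function names `description_length` and `select`, and the encoding of a hypothesis as a list of `(head, body)` clauses are all assumptions of the sketch.

```python
def description_length(hypothesis):
    """Crude proxy for complexity: literals in all clauses (head plus body)."""
    return sum(1 + len(body) for _head, body in hypothesis)


def select(hypotheses):
    """Return the hypotheses with the best score, i.e. the shortest ones."""
    best = min(description_length(h) for h in hypotheses)
    return [h for h in hypotheses if description_length(h) == best]


# Two correct candidates for the daughter relation: the second still carries
# a redundant ground literal from the background knowledge.
h_short = [(("dau", "X", "Y"), [("par", "Y", "X"), ("fem", "X")])]
h_long = [(("dau", "X", "Y"), [("par", "Y", "X"), ("fem", "X"), ("par", "h", "m")])]

print(select([h_long, h_short]))
```

Both candidates entail the same positive examples, so the shorter clause is preferred. Real systems use more refined measures, for instance ones that also account for how many examples a hypothesis covers.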

Hypothesis search

Modern ILP systems such as Progol,^[6] Hail,^[15] and Imparo^[16] find a hypothesis $H$ using the principle of inverse entailment^[6] for theories $B$, $E$, $H$: $B\land H\models E\iff B\land \neg E\models \neg H$. First they construct an intermediate theory $F$, called a bridge theory, that satisfies the conditions $B\land \neg E\models F$ and $F\models \neg H$. Then, since $H\models \neg F$, they generalize the negation of the bridge theory $F$ with anti-entailment.^[17] However, the anti-entailment operation is highly non-deterministic and therefore computationally more expensive. An alternative hypothesis search can be conducted using the inverse subsumption (anti-subsumption) operation instead, which is less non-deterministic than anti-entailment.
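
The inverse entailment equivalence can be sanity-checked by brute force in the propositional case. The sketch below represents each formula over two atoms by its truth table and tests all combinations of $B$, $H$ and $E$; it is an illustration of the equivalence only, not a proof for first-order theories and not part of any ILP system. All names in it are assumptions of this sketch.

```python
from itertools import product

# Every formula over two propositional atoms is identified with its truth
# table: one Boolean per valuation, in a fixed order of the 4 valuations.
N_VALUATIONS = 4
TABLES = list(product([False, True], repeat=N_VALUATIONS))  # 16 formulas


def entails(premise, conclusion):
    """premise |= conclusion: every valuation satisfying premise satisfies conclusion."""
    return all(c for p, c in zip(premise, conclusion) if p)


def conj(*tables):
    return tuple(all(values) for values in zip(*tables))


def neg(table):
    return tuple(not value for value in table)


def inverse_entailment_holds():
    """Check B and H |= E  <=>  B and not E |= not H for all 16**3 triples."""
    for B, H, E in product(TABLES, repeat=3):
        if entails(conj(B, H), E) != entails(conj(B, neg(E)), neg(H)):
            return False
    return True


print(inverse_entailment_holds())
```

Both sides fail exactly when some valuation makes $B$ and $H$ true and $E$ false, which is why the check succeeds for every triple.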

Questions about the completeness of the hypothesis search procedure of specific ILP systems arise. For example, Progol's hypothesis search procedure, based on the inverse entailment inference rule, is not complete, as shown by Yamamoto's example.^[18] On the other hand, Imparo is complete with both the anti-entailment procedure^[19] and its extended inverse subsumption procedure.^[20]

Implementations

1BC and 1BC2: first-order naive Bayesian classifiers (http://www.cs.bris.ac.uk/Research/MachineLearning/1BC/)
ACE (A Combined Engine) (http://dtai.cs.kuleuven.be/ACE/)
Aleph (http://web.comlab.ox.ac.uk/oucl/research/areas/machlearn/Aleph/)
Atom (http://www.ahlgren.info/research/atom/)
Claudien (http://dtai.cs.kuleuven.be/claudien/)
DL-Learner (http://dl-learner.org)
DMax (http://dtai.cs.kuleuven.be/dmax/)
FOIL (ftp://ftp.cs.su.oz.au/pub/foil6.sh^{[dead link]})
Golem (http://www.doc.ic.ac.uk/~shm/Software/golem)
Imparo^[19]
Inthelex (INcremental THEory Learner from EXamples) (http://lacam.di.uniba.it:8000/systems/inthelex/)
Lime (https://web.archive.org/web/20020516195248/http://cs.anu.edu.au/people/Eric.McCreath/lime.html)
Metagol (http://github.com/metagol/metagol)
Mio (http://libra.msra.cn/Publication/3392493/mio-user-s-manual^{[dead link]})
MIS (Model Inference System) by Ehud Shapiro
PROGOL (http://www.doc.ic.ac.uk/~shm/Software/progol5.0)
RSD (https://web.archive.org/web/20070301162526/http://labe.felk.cvut.cz/~zelezny/rsd/)
Tertius (http://www.cs.bris.ac.uk/publications/Papers/1000545.pdf)
Warmr (now included in ACE)
ProGolem (http://ilp.doc.ic.ac.uk/ProGolem/)^[21]^[22]

See also

Commonsense reasoning
Formal concept analysis
Inductive method
Inductive reasoning
Inductive programming
Inductive probability
Statistical relational learning
Version space learning

References

[1] Plotkin G.D. Automatic Methods of Inductive Inference, PhD thesis, University of Edinburgh, 1970.
[2] Shapiro, Ehud Y. Inductive inference of theories from facts, Research Report 192, Yale University, Department of Computer Science, 1981. Reprinted in J.-L. Lassez, G. Plotkin (Eds.), Computational Logic, The MIT Press, Cambridge, MA, 1991, pp. 199–254.
[3] Shapiro, Ehud Y. (1983). Algorithmic program debugging. Cambridge, Mass: MIT Press. ISBN 0-262-19218-7
[4] Shapiro, Ehud Y. "The model inference system." Proceedings of the 7th international joint conference on Artificial intelligence-Volume 2. Morgan Kaufmann Publishers Inc., 1981.
[5] Luc De Raedt. A Perspective on Inductive Logic Programming. The Workshop on Current and Future Trends in Logic Programming, Shakertown, to appear in Springer LNCS, 1999. CiteSeerX: 10.1.1.56.1790
[6] Muggleton, S.H. (1991). "Inductive logic programming". New Generation Computing. 8 (4): 295–318. doi:10.1007/BF03037089
[7] Muggleton S.H. and Buntine W. "Machine invention of first-order predicate by inverting resolution", Proceedings of the 5th International Conference on Machine Learning, 1988.
[8] Muggleton S.H., "Inverting entailment and Progol", New Generation Computing, 13:245-286, 1995.
[9] Muggleton, Stephen (1999). "Inductive Logic Programming: Issues, Results and the Challenge of Learning Language in Logic". Artificial Intelligence. 114: 283–296. doi:10.1016/s0004-3702(99)00067-3; here: Sect.2.1
[10] Džeroski, Sašo (1996), "Inductive Logic Programming and Knowledge Discovery in Databases", in: Fayyad, U.M.; Piatetsky-Shapiro, G.; Smith, P.; Uthurusamy, R., Advances in Knowledge Discovery and Data Mining, MIT Press, pp. 117–152; here: Sect.5.2.4
[11] Plotkin, Gordon D. (1970). Meltzer, B.; Michie, D., eds. "A Note on Inductive Generalization". Edinburgh University Press. Machine Intelligence. 5: 153–163
[12] Plotkin, Gordon D. (1971). Meltzer, B.; Michie, D., eds. "A Further Note on Inductive Generalization". Edinburgh University Press. Machine Intelligence. 6: 101–124
[13] i.e. sharing the same predicate symbol and negated/unnegated status
[14] in general: n-tuple when n positive example literals are given
[15] Ray, O., Broda, K., & Russo, A. M. (2003). Hybrid abductive inductive learning. In LNCS: Vol. 2835. Proceedings of the 13th international conference on inductive logic programming (pp. 311–328). Berlin: Springer.
[16] Kimber, T., Broda, K., & Russo, A. (2009). Induction on failure: learning connected Horn theories. In LNCS: Vol. 5753. Proceedings of the 10th international conference on logic programing and nonmonotonic reasoning (pp. 169–181). Berlin: Springer.
[17] Yoshitaka Yamamoto, Katsumi Inoue, and Koji Iwanuma. Inverse subsumption for complete explanatory induction. Machine learning, 86(1):115–139, 2012.
[18] Akihiro Yamamoto. Which hypotheses can be found with inverse entailment? In Inductive Logic Programming, pages 296–308. Springer, 1997.
[19] Timothy Kimber. Learning definite and normal logic programs by induction on failure. PhD thesis, Imperial College London, 2012.
[20] David Toth (2014). Imparo is complete by inverse subsumption. arXiv:1407.3836
[21] Muggleton, Stephen; Santos, Jose; Tamaddoni-Nezhad, Alireza (2009). "ProGolem: a system based on relative minimal generalization" (PDF). ILP
[22] Santos, Jose; Nassif, Houssam; Page, David; Muggleton, Stephen; Sternberg, Mike (2012). "Automated identification of features of protein-ligand interactions using Inductive Logic Programming: a hexose binding case study" (PDF). BMC Bioinformatics. 13: 162. doi:10.1186/1471-2105-13-162

Further reading

Muggleton, S.; De Raedt, L. (1994). «Inductive Logic Programming: Theory and methods». The Journal of Logic Programming. 19-20: 629–679. doi:10.1016/0743-1066(94)90035-3
Lavrac, N.; Dzeroski, S. (1994). Inductive Logic Programming: Techniques and Applications. New York: Ellis Horwood. ISBN 0-13-457870-8. Retrieved 6 December 2016. Archived from the original on 6 September 2004
Visual example of inducing the grandparenthood relation by the Atom system. http://john-ahlgren.blogspot.com/2014/03/inductive-reasoning-visualized.html
