Selection: with tag machine-learning [29 articles] 


Resampling methods for meta-model validation with recommendations for evolutionary computation

Evolutionary Computation, Vol. 20, No. 2. (16 February 2012), pp. 249-275,


Meta-modeling has become a crucial tool in solving expensive optimization problems. Much of the work in the past has focused on finding a good regression method to model the fitness function. Examples include classical linear regression, splines, neural networks, Kriging and support vector regression. This paper specifically draws attention to the fact that assessing model accuracy is a crucial aspect in the meta-modeling framework. Resampling strategies such as cross-validation, subsampling, bootstrapping, and nested resampling are prominent methods for model validation and ...


Combining multiple classifiers: an application using spatial and remotely sensed information for land cover type mapping

Remote Sensing of Environment, Vol. 74, No. 3. (December 2000), pp. 545-556,


This article discusses two new methods for increasing the accuracy of classifiers used in land cover mapping. The first method, called the product rule, is a simple and general method of combining two or more classification rules as a single rule. Stacked regression methods of combining classification rules are discussed and compared to the product rule. The second method of increasing classifier accuracy is a simple nonparametric classifier that uses spatial information for classification. Two data sets used for land cover mapping ...
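
As an illustration of the idea named in this abstract, the sketch below combines per-class probability estimates from several classifiers by multiplying them and renormalizing. This is the textbook form of a product rule and is an assumption here; the article's exact formulation may differ. The function name `product_rule` and the example probabilities are hypothetical.

```python
def product_rule(prob_lists):
    """Combine class-probability vectors from several classifiers.

    prob_lists: one probability vector per classifier, all of the same
    length (one entry per class). Returns the normalized product.
    """
    n_classes = len(prob_lists[0])
    combined = [1.0] * n_classes
    for probs in prob_lists:
        if len(probs) != n_classes:
            raise ValueError("all classifiers must score the same classes")
        for k, p in enumerate(probs):
            combined[k] *= p  # multiply the estimates class by class
    total = sum(combined)
    if total == 0.0:
        raise ValueError("every class received zero combined probability")
    return [c / total for c in combined]


# Two hypothetical classifiers scoring two land cover classes.
combined = product_rule([[0.6, 0.4], [0.7, 0.3]])
predicted_class = combined.index(max(combined))
```

A class that any single classifier scores as zero is vetoed by the product, which is the main behavioural difference from averaging the estimates.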


Bagging ensemble selection for regression

In AI 2012: Advances in Artificial Intelligence, Vol. 7691 (2012), pp. 695-706,


Bagging ensemble selection (BES) is a relatively new ensemble learning strategy. The strategy can be seen as an ensemble of the ensemble selection from libraries of models (ES) strategy. Previous experimental results on binary classification problems have shown that using random trees as base classifiers, BES-OOB (the most successful variant of BES) is competitive with (and in many cases, superior to) other ensemble learning strategies, for instance, the original ES algorithm, stacking with linear regression, random forests or boosting. Motivated by ...


Bagging ensemble selection

In AI 2011: Advances in Artificial Intelligence, Vol. 7106 (2011), pp. 251-260,


Ensemble selection has recently appeared as a popular ensemble learning method, not only because its implementation is fairly straightforward, but also due to its excellent predictive performance on practical problems. The method has been highlighted in winning solutions of many data mining competitions, such as the Netflix competition, the KDD Cup 2009 and 2010, the UCSD FICO contest 2010, and a number of data mining competitions on the Kaggle platform. In this paper we present a novel variant: bagging ensemble selection. ...


SoilGrids250m: Global gridded soil information based on machine learning

PLOS ONE, Vol. 12, No. 2. (16 February 2017), e0169748,


This paper describes the technical development and accuracy assessment of the most recent and improved version of the SoilGrids system at 250m resolution (June 2016 update). SoilGrids provides global predictions for standard numeric soil properties (organic carbon, bulk density, Cation Exchange Capacity (CEC), pH, soil texture fractions and coarse fragments) at seven standard depths (0, 5, 15, 30, 60, 100 and 200 cm), in addition to predictions of depth to bedrock and distribution of soil classes based on the World Reference ...


Estimating future burned areas under changing climate in the EU-Mediterranean countries

Science of The Total Environment, Vol. 450-451 (April 2013), pp. 209-222,


The impacts of climate change on forest fires have received increased attention in recent years at both continental and local scales. It is widely recognized that weather plays a key role in extreme fire situations. It is therefore of great interest to analyze projected changes in fire danger under climate change scenarios and to assess the consequent impacts of forest fires. In this study we estimated burned areas in the European Mediterranean (EU-Med) countries under past and future climate conditions. Historical ...


Random forests

Machine Learning, Vol. 45, No. 1. (2001), pp. 5-32,


Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random ...


(INRMM-MiD internal record) List of keywords of the INRMM meta-information database - part 20

(February 2014)
Keywords: inrmm-list-of-tags   liquidambar-styraciflua   liriodendron-spp   liriodendron-tulipifera   list   literate-programming   lithocarpus-densiflorus   lithocarpus-glaber   lithocarpus-spp   lithuania   litter   local   local-average-invariance   local-over-complication   local-scale   lodoicea-maldivica   logging   logic-programming   logics   logistic-regression   lognormal-distribution   long-distance-dispersal   long-distance-pollen-flow   long-lived-changes   long-range-transport   long-term   lonicera-alpigena   lonicera-caerulea   lonicera-nigra   lonicera-periclymenum   lonicera-spp   lonicera-tatarica   lonicera-xylosteum   loranthus-europaeus   lose-lose-solution   low-diversity   low-intensity-agriculture   low-intensity-cumulated-effect   low-pass-filtering   lpj-guess   lucanidae   lupinus-incana   lupinus-spp   lymantria-dispar   lymantria-monacha   lyonothamnus-floribundus   lysiloma-latisiliquum   macchia   macedonia   machine-learning   maclura-spp   macro-remains   macroclimate   macroecology   macrofossils   macropsis-glandacea   maghreb   magnolia-acuminata   magnolia-grandiflora   magnoliophyta   mahalanobis-distance   mahonia-spp   malta   malus-crescimannoi   malus-dasyphylla   malus-pumila   malus-spp   malus-sylvestris   mammals   mammea-americana   management   management-indicators   management-strategies   manganese   mangifera-indica   mangrove-forest   mangroves   manifesto   manilkara-zapota   manual   manual-cutting   maple   maple-ash   maple-decline   maple-linden   mapping   mapping-networks   maps   maquis   marchalina   marginal-populations   marine-ecosystem   marssonina-betulae   mass-extinction   mass-spectrometry   mast-fruiting   mastixioideae   mastrave-modelling-library   mathematical-reasoning   mathematics  


List of indexed keywords within the transdisciplinary set of domains which relate to the Integrated Natural Resources Modelling and Management (INRMM). In particular, the list of keywords maps the semantic tags in the INRMM Meta-information Database (INRMM-MiD). The INRMM-MiD records providing this list are accessible by the special tag: inrmm-list-of-tags. ...


Predicting habitat suitability with machine learning models: The potential area of Pinus sylvestris L. in the Iberian Peninsula

Ecological Modelling, Vol. 197, No. 3-4. (August 2006), pp. 383-393,


We present a modelling framework for predicting forest areas. The framework is obtained by integrating a machine learning software suite within the GRASS Geographical Information System (GIS) and by providing additional methods for predictive habitat modelling. Three machine learning techniques (Tree-Based Classification, Neural Networks and Random Forest) are available in parallel for modelling from climatic and topographic variables. Model evaluation and parameter selection are measured by sensitivity-specificity ROC analysis, while the final presence and absence maps are obtained through maximisation of ...


Does the interpolation accuracy of species distribution models come at the expense of transferability?

Ecography, Vol. 35, No. 3. (March 2012), pp. 276-288,


Model transferability (extrapolative accuracy) is one important feature in species distribution models, required in several ecological and conservation biological applications. This study uses 10 modelling techniques and nationwide data on both (1) species distribution of birds, butterflies, and plants and (2) climate and land cover in Finland to investigate whether good interpolative prediction accuracy for models comes at the expense of transferability – i.e. markedly worse performance in new areas. Models’ interpolation and extrapolation performance was primarily assessed using AUC (the ...


The Neighbor Search approach applied to reservoir optimal operation: the Hoa Binh case study

No. hal-00698200. (2006)


[Conclusion] The focus of this thesis is to show, through a real-case application, how the NS can be useful to improve the decision-making process for multi-objective reservoir operation planning. After a survey of the principal techniques employed in the literature to solve such problems, the NS algorithm has been discussed. Further, the case study has been presented. Hoa Binh is the largest reservoir in Vietnam, providing 40% of its total power supplies and protecting the capital Hanoi from major flooding events. This double purpose generates conflicts in its management ...


Research priorities for robust and beneficial artificial intelligence

(January 2015)


[Executive Summary] Success in the quest for artificial intelligence has the potential to bring unprecedented benefits to humanity, and it is therefore worthwhile to research how to maximize these benefits while avoiding potential pitfalls. This document gives numerous examples (which should by no means be construed as an exhaustive list) of such worthwhile research aimed at ensuring that AI remains robust and beneficial. [Research Priorities for Robust and Beneficial Artificial Intelligence: an Open Letter] Artificial intelligence (AI) research has explored a variety of problems and approaches since ...


Neural Turing machines

(10 Dec 2014)


We extend the capabilities of neural networks by coupling them to external memory resources, which they can interact with by attentional processes. The combined system is analogous to a Turing Machine or Von Neumann architecture but is differentiable end-to-end, allowing it to be efficiently trained with gradient descent. Preliminary results demonstrate that Neural Turing Machines can infer simple algorithms such as copying, sorting, and associative recall from input and output examples. ...



Sparse Algorithms Are Not Stable: A No-Free-Lunch Theorem

Pattern Analysis and Machine Intelligence, IEEE Transactions on, Vol. 34, No. 1. (January 2012), pp. 187-193,


We consider two desired properties of learning algorithms: *sparsity* and *algorithmic stability*. Both properties are believed to lead to good generalization ability. We show that these two properties are fundamentally at odds with each other: a sparse algorithm cannot be stable and vice versa. Thus, one has to trade off sparsity and stability in designing a learning algorithm. In particular, our general result implies that $\ell_1$-regularized regression (Lasso) cannot be stable, while $\ell_2$-regularized regression is known to have strong stability properties ...


A survey of multiple classifier systems as hybrid systems

Information Fusion, Vol. 16 (March 2014), pp. 3-17,


A current focus of intense research in pattern classification is the combination of several classifier systems, which can be built following either the same or different models and/or datasets building approaches. These systems perform information fusion of classification decisions at different levels overcoming limitations of traditional approaches based on single classifiers. This paper presents an up-to-date survey on multiple classifier system (MCS) from the point of view of Hybrid Intelligent Systems. The article discusses major issues, such as diversity and decision ...


Comparing and combining physically-based and empirically-based approaches for estimating the hydrology of ungauged catchments

Journal of Hydrology, Vol. 508 (January 2014), pp. 227-239,


[Highlights] [::] Methods for estimating various hydrological indices at ungauged sites were compared. [::] Methods included a TopNet rainfall-runoff model and a Random Forest empirical model. [::] TopNet estimates were improved through correction using Random Forest estimates. [::] Random Forests provided the best estimates of all indices except mean flow. [::] Mean flow was best estimated using an already published empirical method. [Summary] Predictions of hydrological regimes at ungauged sites are required for various purposes such as setting environmental flows, assessing availability of water resources or ...


Shifts in Arctic vegetation and associated feedbacks under climate change

Nature Climate Change, Vol. 3, No. 7. (31 July 2013), pp. 673-677,


Climate warming has led to changes in the composition, density and distribution of Arctic vegetation in recent decades. These changes cause multiple opposing feedbacks between the biosphere and atmosphere, the relative magnitudes of which will have globally significant consequences but are unknown at a pan-Arctic scale. The precise nature of Arctic vegetation change under future warming will strongly influence climate feedbacks, yet Earth system modelling studies have so far assumed arbitrary increases in shrubs ...


A statistical explanation of MaxEnt for ecologists

Diversity and Distributions, Vol. 17, No. 1. (1 January 2011), pp. 43-57,


MaxEnt is a program for modelling species distributions from presence-only species records. This paper is written for ecologists and describes the MaxEnt model from a statistical perspective, making explicit links between the structure of the model, decisions required in producing a modelled distribution, and knowledge about the species and the data that might affect those decisions. To begin we discuss the characteristics of presence-only data, highlighting implications for modelling distributions. We particularly focus on the problems of sample bias and lack ...


Knowledge discovery by accuracy maximization

Proceedings of the National Academy of Sciences, Vol. 111, No. 14. (24 April 2014), pp. 5117-5122,


[Significance] We propose an innovative method to extract new knowledge from noisy and high-dimensional data. Our approach differs from previous methods in that it has an integrated procedure of validation of the results through maximization of cross-validated accuracy. In many cases, this method performs better than existing feature extraction methods and offers a general framework for analyzing any kind of complex data in a broad range of sciences. Examples ranging from genomics and metabolomics to astronomy and linguistics show the versatility ...


Mapping land cover from detailed aerial photography data using textural and neural network analysis

International Journal of Remote Sensing, Vol. 28, No. 7. (1 April 2007), pp. 1625-1642,


Automated mapping of land cover using black and white aerial photographs, as an alternative method to traditional photo-interpretation, requires using methods other than spectral analysis classification. To this end, textural measurements have been shown to be useful indicators of land cover. In this work, a neural network model is proposed and tested to map historical land use/land cover (LUC) from very detailed panchromatic aerial photographs (5 m resolution) using textural measurements. The method is used to identify different land use and management ...


Vulnerability of Pinus cembra L. in the Alps and the Carpathian mountains under present and future climates

Forest Ecology and Management, Vol. 259, No. 4. (05 February 2010), pp. 750-761,


Proactive management should be applied within a forest conservation context to prevent extinction or degradation of those forest ecosystems that we suspect will be affected by global warming in the next century. The aim of this study is to estimate the vulnerability under climate change of a localized and endemic tree species Pinus cembra that occurs in the alpine timberline. We used the Random Forest ensemble classifier and available bioclimatic and ecological data to model present and future suitable areas for ...


Novel methods improve prediction of species' distributions from occurrence data

Ecography, Vol. 29, No. 2. (1 April 2006), pp. 129-151,


Prediction of species’ distributions is central to diverse applications in ecology, evolution and conservation science. There is increasing electronic access to vast sets of occurrence records in museums and herbaria, yet little effective guidance on how best to use this information in the context of numerous approaches for modelling distributions. To meet this need, we compared 16 modelling methods over 226 species from 6 regions of the world, creating the most comprehensive set of model comparisons to date. We used presence-only ...


The Need for Open Source Software in Machine Learning

J. Mach. Learn. Res., Vol. 8 (December 2007), pp. 2443-2466


Open source tools have recently reached a level of maturity which makes them suitable for building large-scale real-world systems. At the same time, the field of machine learning has developed a large body of powerful learning algorithms for diverse applications. However, the true potential of these methods is not used, since existing implementations are not openly shared, resulting in software with low usability, and weak interoperability. We argue that this situation can be significantly improved by increasing incentives for researchers to ...


Batch mode reinforcement learning based on the synthesis of artificial trajectories

In Annals of Operations Research, Vol. 208, No. 1. (2013), pp. 383-416,


In this paper, we consider the batch mode reinforcement learning setting, where the central problem is to learn from a sample of trajectories a policy that satisfies or optimizes a performance criterion. We focus on the continuous state space case for which usual resolution schemes rely on function approximators either to represent the underlying control problem or to represent its value function. As an alternative to the use of function approximators, we rely on the synthesis of “artificial trajectories” from the ...


Waffles: A Machine Learning Toolkit

Journal of Machine Learning Research, Vol. 12 (2011), pp. 2383-2387


We present a breadth-oriented collection of cross-platform command-line tools for researchers in machine learning called Waffles. The Waffles tools are designed to offer a broad spectrum of functionality in a manner that is friendly for scripted automation. All functionality is also available in a C++ class library. Waffles is available under the GNU Lesser General Public License. ...


A few useful things to know about machine learning

Commun. ACM, Vol. 55, No. 10. (October 2012), pp. 78-87,


Machine learning algorithms can figure out how to perform important tasks by generalizing from examples. This is often feasible and cost-effective where manual programming is not. As more data becomes available, more ambitious problems can be tackled. As a result, machine learning is widely used in computer science and other fields. However, developing successful machine learning applications requires a substantial amount of “black art” that is hard to find in textbooks. This article summarizes twelve key lessons that machine learning researchers ...


The rise and fall of supervised machine learning techniques

Bioinformatics, Vol. 27, No. 24. (15 December 2011), pp. 3331-3332,


Machine learning is of immense importance in bioinformatics and biomedical science more generally (Larrañaga et al., 2006; Tarca et al., 2007). In particular, supervised machine learning has been used to great effect in numerous bioinformatics prediction methods. Through many years of editing and reviewing manuscripts, we noticed that some supervised machine learning techniques seem to be gaining in popularity while others seemed, at least to our eyes, to be looking ‘unfashionable’. ...



Q-learning

Machine Learning, Vol. 8, No. 3-4. (1 May 1992), pp. 279-292,
Keywords: machine-learning   q-learning  


Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains. It amounts to an incremental method for dynamic programming which imposes limited computational demands. It works by successively improving its evaluations of the quality of particular actions at particular states. This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989). We show that Q-learning converges to the optimum action-values with probability 1 so long ...
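
The incremental improvement of action values described in this abstract can be sketched with the standard tabular update, Q(s, a) ← Q(s, a) + α [r + γ max Q(s', ·) − Q(s, a)]. The environment below (a short chain where moving right eventually reaches a rewarded goal), the function name `q_learning`, and all parameter values are assumptions made for illustration only; they are not taken from the paper.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical chain of states 0..n_states-1.

    Action 0 moves left (staying put at state 0), action 1 moves right.
    Entering the last state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy choice; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # Move the estimate toward the one-step bootstrapped target.
            target = r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


q_table = q_learning()
greedy_policy = [0 if left > right else 1 for left, right in q_table[:-1]]
```

In this deterministic toy problem the learned values approach the discounted distance to the goal, so the greedy policy moves right in every non-goal state.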


Dynamic programming applications in water resources

Water Resources Research, Vol. 18, No. 4. (1982)


The central intention of this survey is to review dynamic programming models for water resource problems and to examine computational techniques which have been used to obtain solutions to these problems. Problem areas surveyed here include aqueduct design, ...

This page of the database may be cited as:
Integrated Natural Resources Modelling and Management - Meta-information Database.

Meta-information Database (INRMM-MiD).
This database integrates a dedicated meta-information database in CiteULike (the CiteULike INRMM Group) with the meta-information available in Google Scholar, CrossRef and DataCite. The Altmetric database with Article-Level Metrics is also harvested. Part of the provided machine-readable semantic content is also made human-readable thanks to the DCMI Dublin Core viewer. Digital preservation of the meta-information indexed within the INRMM-MiD publication records is implemented thanks to the Internet Archive.
Full texts and abstracts of the publications indexed by the INRMM meta-information database are copyrighted by the respective publishers/authors. They are subject to all applicable copyright protection. The conditions of use of each indexed publication are defined by its copyright owner. Please be aware that the indexed meta-information relies entirely on voluntary work and constitutes an incomplete and non-homogeneous work in progress.
INRMM-MiD was experimentally established by the Maieutike Research Initiative in 2008 and then improved with the help of several volunteers (with a major technical upgrade in 2011). This new integrated interface has been operational since 2014.