krangl Release History

v0.18

Added support for Arrow (Thanks to @Kopilov for contributing PR 150)
Improved support for large Excel tables (Thanks to @ayvazj for contributing PR 126)
Added a second version of unfold() that takes property accessors instead of property names

cars.unfold("cars", listOf(Car::brand, Car::ps))

Minor enhancements

Fixed #63: Can not print schema() of empty data-frame

v0.17

Released 2021-07-17

New Jupyter kernel integration with auto-import and improved rendering for DataFrame and DataFrame.schema()
Added DataFrame.letsPlot() to ease integration with lets-plot
New tutorial (jupyter notebook): Mammalian Sleep
Updated to Kotlin v1.5 and added support for value classes in List<Any>.toDataFrame() and DataFrame.unfold()
Added timestamp support to the database API (fixes #124)

v0.16

Released 2021-04-13

krangl is now deployed to Maven Central and no longer to JCenter

Features

Added support for fixed-width files with readFixedWidth()
Added support for more compact column type specification when reading tsv
Fixed: NA and empty cell handling in excel-reader
Fixed: Use correct cell types when writing Excel file

v0.15.6

Republished to maven central https://search.maven.org/artifact/com.github.holgerbrandl.krangl/krangl

v0.15.2

Fixed gather conversion in case of mixed number types
Indicate guessed column type with prefix Any for basic types in schema and print

v0.15.1

Fixed asDataFrame to include parent type properties
Added DataFrame.filterNotNull to remove records with nulls. A column selector can be provided to check only a subset of columns.

v0.15

New Features

#97 Added Excel read/write support (by LeandroC89)

// read
df = DataFrame.readExcel("data.xlsx", sheetName = "sales")
df = DataFrame.readExcel("data.xlsx", cellRange = CellRangeAddress.valueOf("A1:D10"))

// write
df.writeExcel("results.xlsx")

#95 Improved column type casts

dataFrameOf("foo")(1, 2, 3).addColumn("stringified_foo") { it["foo"].toStrings() }.schema()
> DataFrame with 3 observations
> foo              [Int]  1, 2, 3
> stringified_foo  [Str]  1, 2, 3

dataFrameOf("foo")("1", "2", "3").addColumn("parsed_foo") { it["foo"].toInts() }.schema()

> DataFrame with 3 observations
> foo         [Str]  1, 2, 3
> parsed_foo  [Int]  1, 2, 3

#99 Added filtering by list (similar to R's %in% operator)

irisData.filter { it["Species"].inList("setosa", "versicolor")  }

Bug Fixes

#84 Builder now supports mixed numbers in column
#96 & #94 Fixed bugs in join
#100 Improved SQL bindings
#99 Fixed median

v0.14

Fixed missing by values overhanging RHS in outer join (fixes #94)
Added addRow (via PR92 by LeandroC89)
Added column type text to sql interface (fixes #72)

v0.13

Released: 2020-06-02

Added column transformation to calculate cumulative sum cumSum

sales
    .sortedBy("quarter")
    .addColumn("cum_sales" to { it["sold_units"].cumSum()})
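
As a rough illustration of what a cumulative sum computes, here is a stdlib-only sketch. It is not krangl's implementation; the helper name cumulativeSum is hypothetical and it works on a plain List<Double> rather than on a column.

```kotlin
// Stdlib-only sketch of a cumulative sum; NOT krangl's implementation.
// `cumulativeSum` is a hypothetical helper name used for illustration only.
fun cumulativeSum(values: List<Double>): List<Double> =
    // runningReduce keeps every intermediate accumulator value,
    // so element i of the result is the sum of values[0..i].
    values.runningReduce { acc, value -> acc + value }
```

Under these assumptions, cumulativeSum(listOf(10.0, 20.0, 5.0)) yields 10.0, 30.0, 35.0, and an empty input yields an empty list.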

Added column transformation pctChange to calculate the percentage change between the current and a prior element, similar to pct_change in pandas (contributed by @amorphous1 in PR85)

sales
    .groupBy("product")
    .addColumn("sales_pct_change" to { it["sold_units"].pctChange() })
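
The arithmetic behind a percentage change can be sketched with the stdlib alone. This assumes the pandas convention (change relative to the immediately preceding element, no value for the first element); the helper name percentChange is hypothetical and this is not krangl's code.

```kotlin
// Stdlib-only sketch of a percentage change; NOT krangl's implementation.
// Assumption: pandas-style convention, i.e. (current - previous) / previous,
// with null for the first element because it has no predecessor.
// `percentChange` is a hypothetical helper name.
fun percentChange(values: List<Double>): List<Double?> =
    values.indices.map { i ->
        if (i == 0) null else (values[i] - values[i - 1]) / values[i - 1]
    }
```

With these assumptions, percentChange(listOf(100.0, 150.0, 75.0)) gives null, 0.5, -0.5. When applied per group, as in the snippet above, each group would start with its own missing value.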

Added lead and lag (contributed by @amorphous1 in PR85)

sales
    .groupBy("product")
    .sortedBy("quarter")
    .addColumn("prev_quarter_sales" to { it["sold_units"].lag() })
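
A minimal stdlib-only sketch of the shifting that lag and lead perform, assuming an offset of one by default and null for positions that fall outside the data. The helper names lagBy and leadBy are hypothetical; krangl's own functions operate on columns.

```kotlin
// Stdlib-only sketch of lag/lead shifting; NOT krangl's implementation.
// `lagBy` and `leadBy` are hypothetical helper names.

// lag: element i takes the value n positions earlier (null if none exists).
fun <T> lagBy(values: List<T>, n: Int = 1): List<T?> =
    values.indices.map { i -> values.getOrNull(i - n) }

// lead: element i takes the value n positions later (null if none exists).
fun <T> leadBy(values: List<T>, n: Int = 1): List<T?> =
    values.indices.map { i -> values.getOrNull(i + n) }
```

For example, lagBy(listOf(1, 2, 3)) gives null, 1, 2 and leadBy(listOf(1, 2, 3)) gives 2, 3, null. Sorting before shifting matters, which is why the snippet above sorts by quarter first.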

Significantly improved join performance (contributed by @amorphous1 in PR85)
New: Extended bindRows API to combine data row-wise (see PR #77 by @CrystalLord)

val person1 = mapOf("person" to "James", "year" to 1996)
val person2 = mapOf("person" to "Anne", "year" to 1998)

emptyDataFrame().bindRows(person1, person2).print()
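
To show the idea of row-wise binding without the library, the following stdlib-only sketch turns row maps into column lists and fills absent cells with null (standing in for NA). The helper name bindRowMaps and the map-of-lists result are assumptions made for illustration, not krangl's data model.

```kotlin
// Stdlib-only sketch of binding rows into columns; NOT krangl's implementation.
// `bindRowMaps` is a hypothetical helper; null stands in for NA.
fun bindRowMaps(vararg rows: Map<String, Any?>): Map<String, List<Any?>> {
    // Collect column names in first-seen order across all rows.
    val columnNames = rows.flatMap { it.keys }.distinct()
    // Build one value list per column; keys absent from a row become null.
    return columnNames.associateWith { name -> rows.map { it[name] } }
}
```

Calling bindRowMaps(mapOf("person" to "James", "year" to 1996), mapOf("person" to "Anne")) produces a "person" column with James, Anne and a "year" column with 1996, null.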

v0.12

internal release

v0.11

New: Added built-in support for Long columns (PR #69 by @davidpedrosa)

v0.10

Major:

New: summarizeAt for simplified column aggregations
New: setNames to replace column headers of a data-frame
New: Deparse Iterables more conveniently using lambdas in deparseRecords

Minor:

Fixed: Can not read csv-tables without header
Added option to skip lines in csv reader.
Fixed: schema() should not throw memory exception (#53)
Fixed DataFrame.readTSV default format (#56)
Added where() for conditional column creation (relates to #54)
Added writeTSV
Fixed grouping by Any columns
Added: toDoubleMatrix() helper extension method

v0.9.1

Major Enhancements

DataFrame.fromJson will now flatten nested json data

Minor

Added sum() extension for columns summaries/transformation
Added dataFrameOf() that accepts Iterable of names
Added bindRows() alias that accepts data frames as varargs
Added bindCols() extension for list of DataCol
Fill missing cells with NA in bindRows and bindCols
Resolve duplicated column names in bindCols()
Added new builder to create data-frame from DataFrameRow iterator
Added addRowNumber to add the row number as column to a data-frame
Fixed: Incorrect types in gathered columns

v0.9

Released 2018-04-11

Major Enhancements

Allow index access for column model (fixes #46): irisData[1][2]
Improved DataFrame.count to respect existing groupings and to simply count rows if no grouping is defined
Added moveLeft and moveRight to rearrange column order
Added nest and unnest to wrap columns into sub-tables and back
Added expand and complete to expand column value-sets into data-frames
Added function literal support for count and groupBy (fixes #48): irisData.groupByExpr{ it["Sepal.Width"] > 3 }
Added receiver context for sortBy lambdas with sorting specific API (fixes #44)

Improved data-frame rendering

Improved print()ing of data-frames and schema()ta to have better alignment and more formatting options
Print row numbers by default when using print (fixes #49)

Minor Enhancements

Renamed select2/remove2 to selectIf and removeIf
Fixed #39: Can not add scalar object as column
Started submodule for documentation
Hide columns in print after exceeding maximum line length (fixes #50)
Fixed #45: sleepData.sortedBy{ "order" } should fail with informative exception

v0.8

Released 2018-03-21

Major Enhancements

Added property unfolding: df.unfold<Person>("user", properties=listOf("address"))
Added text matching helper: irisData.filter{ it["Species"].isMatching{ startsWith("se") }} (fixes #21)
Added sortedByDescending and desc and added more sorting tests
Added more elegant object bindings via reflection. Example: val objPersons : Iterable<User> = users.rowsAs<User>() (fixes #22)
Added compressed csv write support, configurable or by filename guessing

Minor Enhancements

More robust row to object conversion
Made List<Boolean?>.not() public
Use regex instead of string as separator in separate
Replaced fixed temporary column names with uuids
Fixed incorrect coercion of incomplete inplace data to df
Added concat operator for string column arithmetics
Fixed arithmetic comparison operators
Added beakerx display adapter

v0.7

Released 2018-03-14

Major Enhancements

Allow specifying column types when reading csv data (Thanks to LeanderG for providing the PR)
Added groupedBy to provide distinct set of grouping tuples as data-frame
Read support for URLs (Example DataFrame.readCSV("https://git.io/vxks7").glimpse())
Added basic read/write support for JSON data
Added generic collection conversion Iterable<Any>.asDataFrame() via reflection (fixes #24)

Incompatible API changes

Renamed structure to columnTypes
Renamed all table read function from .from* to .read*
Fixed #29: mapNonNull should use parameter and not receiver

Minor Enhancements

Namespace cleanup to hide internal helpers
Bundled irisData
Enhanced: DataCol.toDouble() should work for int columns as well (and vice versa)
Added MIT License
Use iterable instead of list for object conversions

v0.6

Released: 2017-11-11

More idiomatic API mimicking kotlin stdlib where possible
Added DataFrame.remove to drop columns from data-frames
Added DataFrame.addColumn to add columns to data-frames
Added DataFrame.sortBy(TableFormula)
Added DataFrame.filterByRow
Reworked column selector API
Changed column expression API from Any to a constrained set of support types
Fixed issues when combining columns of different types (e.g. DoubleCol + IntCol)
Dropped most unary operators

v0.5

Skipped.

v0.4

Released 2017-04-12

New Features

spread()-gather() support for elegant data reshaping (fixes #2)
Improve reshaping functionality by adding unite and separate (fixes #9)
Added sampleFrac() and sampleN() for random sub-sampling of data-frames (either with or without replacement)

Important Bug Fixes

mutate() can now change existing columns without altering column positions

Other

New property accessor DataFrame.cols to access all columns of a data-frame
Incremented kotlin version to 1.1

v0.3

Initial Release

Implement all dplyr core verbs
Implement all join types
Table write support using commons-csv wrapper
Extensive unit test coverage
TravisCI integration
Support for count() and distinct()
Basic benchmarking framework (without jvm usage)
