dandelion - A single cell BCR/TCR V(D)J-seq analysis package for 10X Chromium 5' data
AGPL-3.0 License
Full Changelog: https://github.com/zktuong/dandelion/compare/v0.3.4...v0.3.5
Published by zktuong 9 months ago
- Updates to `generate_network`.
- Updates to `vdj_pseudobulk` functions - @ktpolanski
- New column in `.data` (`extra`) to flag if a contig is considered extra.
- `VDJ` and `VJ` are now appended to the id to reduce ambiguity - need to check if it does it properly for cells with no clone ids. This also means that now clone ids can be created for orphan chains.
- `to_scirpy`/`from_scirpy` functions will now convert to the new scverse AIRR formats - @amoschoomy
- Updated ogrdb references in both the base package and the container.
- `setup_vdj_pseudobulk()` by @ktpolanski in https://github.com/zktuong/dandelion/pull/334
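As a loose illustration of the VDJ/VJ suffixing mentioned above, a clone id can be disambiguated by the contig's locus. Everything below (function name, locus set) is a hypothetical sketch, not dandelion's implementation:

```python
# Hypothetical sketch (not dandelion's actual code) of appending a chain-type
# suffix to clone ids so that VDJ (heavy/beta/delta) and VJ (light/alpha/gamma)
# contigs of the same clone remain distinguishable.

VDJ_LOCI = {"IGH", "TRB", "TRD"}  # loci that rearrange a D segment

def suffix_clone_id(clone_id: str, locus: str) -> str:
    """Append _VDJ or _VJ depending on the contig's locus."""
    suffix = "VDJ" if locus in VDJ_LOCI else "VJ"
    return f"{clone_id}_{suffix}"

print(suffix_clone_id("clone_1", "IGH"))  # clone_1_VDJ
print(suffix_clone_id("clone_1", "IGK"))  # clone_1_VJ
```

Because the suffix depends only on the locus, even an orphan chain (a cell with no paired contig) can receive an unambiguous id under this scheme.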
Full Changelog: https://github.com/zktuong/dandelion/compare/v0.3.3...v0.3.4
Published by zktuong about 1 year ago
Updates to `tl.clone_overlap` and `pl.clone_overlap`.
Detailed notes:
Full Changelog: https://github.com/zktuong/dandelion/compare/v0.3.2...v0.3.3
Published by zktuong over 1 year ago
Mainly to fix compatibility with dependencies.
Full Changelog: https://github.com/zktuong/dandelion/compare/v0.3.1...v0.3.2
Published by zktuong over 1 year ago
Just to update PyPI - some bug fixes to accompany the revision.
Doesn't affect the container image (but I should add a tag on Sylabs to also call it 0.3.1, just to be consistent).
Full Changelog: https://github.com/zktuong/dandelion/compare/v0.3.0...v0.3.1
Published by zktuong almost 2 years ago
This release adds a number of new features and minor restructuring to accompany Dandelion's manuscript (uploading soon). Kudos to @suochenqu and @ktpolanski!
Full Changelog: https://github.com/zktuong/dandelion/compare/v0.2.4...v0.3.0
Published by zktuong over 2 years ago
The `Dandelion` object can now be sliced like an `AnnData` or pandas `DataFrame` object!
```python
>>> vdj[vdj.data['productive'] == 'T']
Dandelion class object with n_obs = 38 and n_contigs = 94
    data: 'sequence_id', 'sequence', 'rev_comp', 'productive', 'v_call', 'd_call', 'j_call', 'sequence_alignment', 'germline_alignment', 'junction', 'junction_aa', 'v_cigar', 'd_cigar', 'j_cigar', 'stop_codon', 'vj_in_frame', 'locus', 'junction_length', 'np1_length', 'np2_length', 'cell_id', 'c_call', 'consensus_count', 'duplicate_count', 'rearrangement_status'
    metadata: 'locus_VDJ', 'locus_VJ', 'productive_VDJ', 'productive_VJ', 'v_call_VDJ', 'd_call_VDJ', 'j_call_VDJ', 'v_call_VJ', 'j_call_VJ', 'c_call_VDJ', 'c_call_VJ', 'junction_VDJ', 'junction_VJ', 'junction_aa_VDJ', 'junction_aa_VJ', 'v_call_B_VDJ', 'd_call_B_VDJ', 'j_call_B_VDJ', 'v_call_B_VJ', 'j_call_B_VJ', 'productive_B_VDJ', 'productive_B_VJ', 'v_call_abT_VDJ', 'd_call_abT_VDJ', 'j_call_abT_VDJ', 'v_call_abT_VJ', 'j_call_abT_VJ', 'productive_abT_VDJ', 'productive_abT_VJ', 'v_call_gdT_VDJ', 'd_call_gdT_VDJ', 'j_call_gdT_VDJ', 'v_call_gdT_VJ', 'j_call_gdT_VJ', 'productive_gdT_VDJ', 'productive_gdT_VJ', 'duplicate_count_B_VDJ', 'duplicate_count_B_VJ', 'duplicate_count_abT_VDJ', 'duplicate_count_abT_VJ', 'duplicate_count_gdT_VDJ', 'duplicate_count_gdT_VJ', 'isotype', 'isotype_status', 'locus_status', 'chain_status', 'rearrangement_status_VDJ', 'rearrangement_status_VJ'

>>> vdj[vdj.metadata['productive_VDJ'] == 'T']
Dandelion class object with n_obs = 17 and n_contigs = 36
    data: 'sequence_id', 'sequence', 'rev_comp', 'productive', 'v_call', 'd_call', 'j_call', 'sequence_alignment', 'germline_alignment', 'junction', 'junction_aa', 'v_cigar', 'd_cigar', 'j_cigar', 'stop_codon', 'vj_in_frame', 'locus', 'junction_length', 'np1_length', 'np2_length', 'cell_id', 'c_call', 'consensus_count', 'duplicate_count', 'rearrangement_status'
    metadata: 'locus_VDJ', 'locus_VJ', 'productive_VDJ', 'productive_VJ', 'v_call_VDJ', 'd_call_VDJ', 'j_call_VDJ', 'v_call_VJ', 'j_call_VJ', 'c_call_VDJ', 'c_call_VJ', 'junction_VDJ', 'junction_VJ', 'junction_aa_VDJ', 'junction_aa_VJ', 'v_call_B_VDJ', 'd_call_B_VDJ', 'j_call_B_VDJ', 'v_call_B_VJ', 'j_call_B_VJ', 'productive_B_VDJ', 'productive_B_VJ', 'v_call_abT_VDJ', 'd_call_abT_VDJ', 'j_call_abT_VDJ', 'v_call_abT_VJ', 'j_call_abT_VJ', 'productive_abT_VDJ', 'productive_abT_VJ', 'v_call_gdT_VDJ', 'd_call_gdT_VDJ', 'j_call_gdT_VDJ', 'v_call_gdT_VJ', 'j_call_gdT_VJ', 'productive_gdT_VDJ', 'productive_gdT_VJ', 'duplicate_count_B_VDJ', 'duplicate_count_B_VJ', 'duplicate_count_abT_VDJ', 'duplicate_count_abT_VJ', 'duplicate_count_gdT_VDJ', 'duplicate_count_gdT_VJ', 'isotype', 'isotype_status', 'locus_status', 'chain_status', 'rearrangement_status_VDJ', 'rearrangement_status_VJ'

>>> vdj[vdj.metadata_names.isin(['cell1', 'cell2', 'cell3', 'cell4', 'cell5'])]
Dandelion class object with n_obs = 5 and n_contigs = 20
    data: 'sequence_id', 'sequence', 'rev_comp', 'productive', 'v_call', 'd_call', 'j_call', 'sequence_alignment', 'germline_alignment', 'junction', 'junction_aa', 'v_cigar', 'd_cigar', 'j_cigar', 'stop_codon', 'vj_in_frame', 'locus', 'junction_length', 'np1_length', 'np2_length', 'cell_id', 'c_call', 'consensus_count', 'duplicate_count', 'rearrangement_status'
    metadata: 'locus_VDJ', 'locus_VJ', 'productive_VDJ', 'productive_VJ', 'v_call_VDJ', 'd_call_VDJ', 'j_call_VDJ', 'v_call_VJ', 'j_call_VJ', 'c_call_VDJ', 'c_call_VJ', 'junction_VDJ', 'junction_VJ', 'junction_aa_VDJ', 'junction_aa_VJ', 'v_call_B_VDJ', 'd_call_B_VDJ', 'j_call_B_VDJ', 'v_call_B_VJ', 'j_call_B_VJ', 'productive_B_VDJ', 'productive_B_VJ', 'v_call_abT_VDJ', 'd_call_abT_VDJ', 'j_call_abT_VDJ', 'v_call_abT_VJ', 'j_call_abT_VJ', 'productive_abT_VDJ', 'productive_abT_VJ', 'v_call_gdT_VDJ', 'd_call_gdT_VDJ', 'j_call_gdT_VDJ', 'v_call_gdT_VJ', 'j_call_gdT_VJ', 'productive_gdT_VDJ', 'productive_gdT_VJ', 'duplicate_count_B_VDJ', 'duplicate_count_B_VJ', 'duplicate_count_abT_VDJ', 'duplicate_count_abT_VJ', 'duplicate_count_gdT_VDJ', 'duplicate_count_gdT_VJ', 'isotype', 'isotype_status', 'locus_status', 'chain_status', 'rearrangement_status_VDJ', 'rearrangement_status_VJ'

>>> vdj[vdj.data_names.isin(['contig1','contig2','contig3','contig4','contig5'])]
Dandelion class object with n_obs = 2 and n_contigs = 5
    data: 'sequence_id', 'sequence', 'rev_comp', 'productive', 'v_call', 'd_call', 'j_call', 'sequence_alignment', 'germline_alignment', 'junction', 'junction_aa', 'v_cigar', 'd_cigar', 'j_cigar', 'stop_codon', 'vj_in_frame', 'locus', 'junction_length', 'np1_length', 'np2_length', 'cell_id', 'c_call', 'consensus_count', 'duplicate_count', 'rearrangement_status'
    metadata: 'locus_VDJ', 'locus_VJ', 'productive_VDJ', 'productive_VJ', 'v_call_VDJ', 'd_call_VDJ', 'j_call_VDJ', 'v_call_VJ', 'j_call_VJ', 'c_call_VDJ', 'c_call_VJ', 'junction_VDJ', 'junction_VJ', 'junction_aa_VDJ', 'junction_aa_VJ', 'v_call_B_VDJ', 'd_call_B_VDJ', 'j_call_B_VDJ', 'v_call_B_VJ', 'j_call_B_VJ', 'productive_B_VDJ', 'productive_B_VJ', 'v_call_abT_VDJ', 'd_call_abT_VDJ', 'j_call_abT_VDJ', 'v_call_abT_VJ', 'j_call_abT_VJ', 'productive_abT_VDJ', 'productive_abT_VJ', 'v_call_gdT_VDJ', 'd_call_gdT_VDJ', 'j_call_gdT_VDJ', 'v_call_gdT_VJ', 'j_call_gdT_VJ', 'productive_gdT_VDJ', 'productive_gdT_VJ', 'duplicate_count_B_VDJ', 'duplicate_count_B_VJ', 'duplicate_count_abT_VDJ', 'duplicate_count_abT_VJ', 'duplicate_count_gdT_VDJ', 'duplicate_count_gdT_VJ', 'isotype', 'isotype_status', 'locus_status', 'chain_status', 'rearrangement_status_VDJ', 'rearrangement_status_VJ'
```
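To picture how such synchronized slicing can work: contigs carry a `cell_id` linking them to cells, so subsetting one table implies subsetting the other. A stdlib-only sketch of that idea (illustrative; the real `Dandelion.__getitem__` operates on pandas DataFrames and does far more bookkeeping):

```python
# Toy illustration of slicing that keeps the contig table ("data") and the
# cell table ("metadata") in sync. Plain dicts stand in for the pandas
# DataFrames that the real Dandelion object uses.

contigs = [
    {"sequence_id": "contig1", "cell_id": "cell1", "productive": "T"},
    {"sequence_id": "contig2", "cell_id": "cell1", "productive": "F"},
    {"sequence_id": "contig3", "cell_id": "cell2", "productive": "T"},
]

def slice_by_cells(contigs, keep_cells):
    """Selecting cells keeps only contigs belonging to those cells."""
    return [row for row in contigs if row["cell_id"] in keep_cells]

def slice_by_contigs(contigs, keep_contigs):
    """Selecting contigs derives the surviving cells from what is left."""
    kept = [row for row in contigs if row["sequence_id"] in keep_contigs]
    cells = sorted({row["cell_id"] for row in kept})
    return kept, cells

kept, cells = slice_by_contigs(contigs, {"contig1", "contig2"})
print(len(kept), cells)  # 2 ['cell1']
```

This mirrors the behaviour in the last example above, where selecting 5 contigs leaves only the 2 cells that own them.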
Column slicing like `adata[:, adata.var.something]` is not supported - would it make sense, as it's not really row information in the data slot? The 'row' in `Dandelion` is `.data`, and it doesn't make sense for `.metadata` to be the 'row'.
- New function `ddl.pp.check_contigs`, as a way to just check if contigs are ambiguous rather than outright removing them. I envisage that this will eventually replace `simple` mode in `ddl.pp.filter_contigs` in the future.
- New column in `.data`: `ambiguous`, T/F to indicate whether a contig is considered ambiguous or not (different from cell-level ambiguous). `.metadata` and several other functions ignore any contigs marked as `T` to maintain the same behaviour.
- The difference between `ddl.pp.check_contigs` and `ddl.pp.filter_contigs` is that with `check_contigs` the onus is on the user to remove any 'bad' cells from the GEX data (illustrated in the tutorial), whereas this happens semi-automatically with `filter_contigs`.
- `ddl.update_metadata` now comes with a 'by_celltype' option: a new `retrieve_celltype` subfunction in the `Query` class breaks up the retrieval into the 3 major groups if `by_celltype = True`. This reduces `.obs` bloating.
- Removed `constant_status_VDJ`, `constant_status_VJ`, `productive_status_VDJ` and `productive_status_VJ`, as the metadata was getting bloated with the slight rework of the `Dandelion` metadata slot to account for the new B/abT/gdT columns.
- New `tl.productive_ratio` and `pl.productive_ratio`.
- New `tl.vj_usage_pca`: uses `scanpy.pp.pca` internally; plot with `scanpy.pl.pca`.
- `ddl.pp.filter_contigs`: removed the `filter_vj_chains` argument and replaced it with `filter_extra_vdj_chains` and `filter_extra_vj_chains` to hopefully enable more interpretable behaviour. Fixes #158.
- `ddl.pp.check_contigs`: `rearrangement_status_VDJ` and `rearrangement_status_VJ` (renamed from `rearrangement_VDJ_status` and `rearrangement_VJ_status`) now give a single value for whether a chimeric rearrangement occurred, e.g. TRDV pairing with TRAJ and TRAC as in this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4267242/
- Fixed `ddl.tl.find_clones` crashing if more than 1 type of loci is found in the data. A `B`, `abT` or `gdT` prefix will be appended to BCR/TR-ab/TR-gd clones.
- New metadata column `chain_status` to summarise the reworked `locus_status` column: `ambiguous`, `Orphan VDJ`, `Single pair` etc., similar to `chain_pairing` in scirpy.
- `ddl.concat` now allows for custom suffix/prefix - only operates on `sequence_id`.
- Removed `edges` from the `Dandelion` class because this doesn't get used anywhere and it's also stored in the `networkx` graphs.
- Now uses `networkx` directly so that I don't have to keep changing the adjacency matrices from `pandas` to `networkx` back and forth.

Full Changelog: https://github.com/zktuong/dandelion/compare/v0.2.2...v0.2.4
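The `chain_status` summary can be pictured as a small classifier over a cell's VDJ/VJ contig counts. The rules below are a guess inferred from the labels mentioned in the notes (`Single pair`, `Orphan VDJ`, etc.), not dandelion's actual logic:

```python
# Illustrative guess at the spirit of chain_status (NOT dandelion's actual
# rules): classify a cell from its counts of usable VDJ and VJ contigs,
# i.e. contigs not flagged as ambiguous by check_contigs.

def chain_status(n_vdj: int, n_vj: int) -> str:
    if n_vdj == 0 and n_vj == 0:
        return "No_contig"
    if n_vdj == 1 and n_vj == 1:
        return "Single pair"
    if n_vj == 0:
        return "Orphan VDJ"
    if n_vdj == 0:
        return "Orphan VJ"
    return "Extra pair"  # more contigs than a simple 1:1 pairing

print(chain_status(1, 1))  # Single pair
print(chain_status(1, 0))  # Orphan VDJ
print(chain_status(2, 1))  # Extra pair
```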
Published by zktuong over 2 years ago
Same as v0.2.2, but I seem to have messed up the upload to PyPI, so trying again.
- Overhauled `generate_network`. The `.distance` slot is removed and is now directly stored/converted from the `.graph` slot.
- New option `compute_layout: bool = True`. If the dataset is too large, `compute_layout` can be switched to `False`, in which case only the `networkx` graph is returned. The data can still be visualised later with scirpy's plotting method (see below).
- New option `layout_method: Literal['sfdp', 'mod_fr'] = 'sfdp'`. The new default uses the ultra-fast C++-implemented `sfdp_layout` algorithm in `graph-tool` to generate the final layout. `sfdp` stands for Scalable Force Directed Placement. Adjusting `gamma` alone doesn't really seem to do much.
- The old behaviour is available with `layout_method = 'mod_fr'`. Requires a separate installation of `graph-tool` via conda (not managed by pip) as it has several C++ dependencies.
- `generate_network` should be run last.
- `min_size` was doing the opposite previously and this is now fixed. #155
- `transfer` now includes information that `scirpy` can use to generate the plots: https://github.com/scverse/scirpy/issues/286
- Renamed `productive` to `productive_status`.
- Reworked `filter_contigs` and `initialise_metadata`. `Dandelion` should now initialise and read faster.
- `load_data` will rename `umi_count` to `duplicate_count`.
- `Query`: `Dandelion` will be ordered based on productive first, then followed by umi count (largest to smallest).
- `initialise_metadata`/`update_metadata`/`Dandelion`: removed columns in `.metadata` which were probably bloated and not used:
  - `vdj_status` and `vdj_status_summary` removed and replaced with `rearrangement_VDJ_status` and `rearrangement_VJ_status`.
  - `constant_status` and `constant_summary` removed and replaced with `constant_VDJ_status` and `constant_VJ_status`.
  - `productive` and `productive_summary` combined and replaced with `productive_status`.
  - `locus_status` and `locus_status_summary` combined and replaced with `locus_status`.
  - `isotype_summary` replaced with `isotype_status`.
- `unassigned` or `''` has been changed to the string `'None'` in `.metadata` - not `NoneType`, as there's quite a bit of text processing internally that gets messed up if swapped. `No_contig` will still be populated after transfer to `AnnData` to reflect cells with no TCR/BCR info.
- Deprecation warning for `read_h5`/`write_h5`. Use of `read_h5ddl`/`write_h5ddl` will be enforced in the next update.

Full Changelog: https://github.com/zktuong/dandelion/compare/v0.2.1...v0.2.2
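The `read_h5`/`write_h5` deprecation follows the usual Python pattern: warn in the old entry point while delegating to the new one. A generic sketch (only the function names come from the notes; the bodies here are invented stand-ins):

```python
import warnings

def read_h5ddl(path: str) -> str:
    # stand-in for the real reader; just echoes the path here
    return f"loaded {path}"

def read_h5(path: str) -> str:
    """Deprecated alias kept for one release cycle before removal."""
    warnings.warn(
        "read_h5 is deprecated, use read_h5ddl instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return read_h5ddl(path)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = read_h5("demo.h5ddl")

print(result)  # loaded demo.h5ddl
print(caught[0].category.__name__)  # DeprecationWarning
```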
Published by zktuong over 2 years ago
Full Changelog: https://github.com/zktuong/dandelion/compare/v0.2.0...v0.2.1
Published by zktuong over 2 years ago
Full Changelog: https://github.com/zktuong/dandelion/compare/v0.1.12...v0.2.0
Published by zktuong over 2 years ago
Full Changelog: https://github.com/zktuong/dandelion/compare/v0.1.11...v0.1.12
Published by zktuong almost 3 years ago
Full Changelog: https://github.com/zktuong/dandelion/compare/v0.1.10...v0.1.11
Published by zktuong about 3 years ago
Fix minor bug in TCR preprocessing.
Fix documentation building script.
Add logging in singularity container.
Also testing an automated build workflow for creating the singularity container, which seems to be working:
https://github.com/zktuong/github-ci
Added support statement in readme.
Published by zktuong about 3 years ago
Fixed `Query` to return numerical queries properly.
Published by zktuong about 3 years ago
- Reworked `filter_contigs`/`FilterContigs`/`FilterContigLite`. Solves #92, where `duplicate_counts` were not adding up.
- `Query`: the `__init__` method preloads the required fields as a tree, and a separate `retrieve` method accesses the dictionaries much faster. Same method as above.
- `read_10x_vdj`: reworked `parse_annotation`, which was slowing everything down.
- `sanitize_data`: uses `airr.RearrangementSchema` to match the dtype to deal with missing values. Also sped up some steps to make the validation faster.
- New `write_airr` function that basically calls `airr.create_rearrangement`.
- `quantify_mutations` should hopefully return the right dtypes now.
- `filter_contigs` can now run without `anndata`.
- Updated `dandelion_preprocessing.py` to let it run `quantify_mutations` based on the args.
- Added `file_prefix`.
Published by zktuong about 3 years ago
- Reworked `retrieve_metadata`, which is now a new internal `Query` class.
- Allows `ig` or `tr` to be specified.
Published by zktuong about 3 years ago
Published by zktuong about 3 years ago
Added a series of functions to guess the best format for locus as the previous try-except statements were not working.
Added a few more tests - now code coverage is 70%!
Added `sanitize_dtype` function so that saving and conversion of format with R/h5/h5ad works.
Changed bool to str where possible so as not to interfere with saving in h5ad with h5py>=3.0.0.
Minor annotation update: using `Optional` instead of `Union[None, whatever]`.
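This annotation change is purely cosmetic, since `Optional[X]` is literally defined as `Union[X, None]`:

```python
from typing import Optional, Union

# Optional[X] is shorthand for Union[X, None], so the two spellings are
# interchangeable at runtime and for type checkers; argument order in a
# Union does not matter either.
print(Optional[str] == Union[str, None])     # True
print(Union[str, None] == Union[None, str])  # True
```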
Fix broken links in docs.
Bug fixes for compatibility with mouse data (#70) and TCR data (#81).
Some bug fixes to container script.
Updated container definition:
- Setting `SINGULARITY_ENVIRONMENT` first helps with #79.
- Added pytest suite within the container. Now I can quickly test as follows:
```shell
# devel and testing
sudo singularity build --notest sc-dandelion_dev.sif sc-dandelion_dev.def
sudo singularity test --writable-tmpfs sc-dandelion_dev.sif

# for release
sudo singularity build --notest sc-dandelion.sif sc-dandelion.def
sudo singularity test --writable-tmpfs sc-dandelion.sif
singularity sign sc-dandelion.sif
singularity verify sc-dandelion.sif
singularity push sc-dandelion.sif library://kt16/default/sc-dandelion
```
The test requires root access, otherwise it wouldn't be able to write into the container. Should look into whether sandbox/fakeroot modes work.
Interactive shell sessions of the container should now work with `singularity shell --writable-tmpfs sc-dandelion.sif`.
#62 Streamline update_metadata
#63 Fix update_metadata to work with concat.
#64 Allow retrieve to work both ways
#68 Native implementation of function to count mutation
#69 Rescue contigs that fail germline reconstruction?
Published by zktuong over 3 years ago
Multiple bug fixes.
Reworked `filter_contigs`:
- New simple mode in `filter_contigs` where it just checks for v/j/c gene call mis-match. Toggled with `simple = True`.
- Replaced the `rescue_igh` option with a `keep_highest_umi` option, which applies to all loci.
Updated the isotype dictionary to allow for mouse genes, which seems to address #70.
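The `simple = True` mode's v/j/c mis-match check can be pictured as testing whether a contig's gene calls agree on locus. A rough sketch (illustrative only; the real check handles multi-calls, missing values, etc.):

```python
# Illustrative sketch (not dandelion's code) of a "simple" v/j/c consistency
# check: flag a contig whose gene calls disagree on locus, taking the locus
# as the first three characters of each call (e.g. IGH, IGK, TRB).

def locus_mismatch(v_call: str, j_call: str, c_call: str) -> bool:
    loci = {call[:3] for call in (v_call, j_call, c_call) if call}
    return len(loci) > 1

print(locus_mismatch("IGHV1-2", "IGHJ4", "IGHM"))  # False - consistent IGH
print(locus_mismatch("IGHV1-2", "IGKJ1", "IGHM"))  # True  - IGH/IGK mismatch
```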
Added more tests.
Updated `ddl.pp.calculate_threshold` to reflect shazam's updated `distToNearest` functionality.
Added sanitization functions to check that data stored in dandelion is relatively compliant with airr-standards (barring missing columns from 10x's data or scirpy's transferred data).
Updated transfer of boolean columns to anndata to be stored as string rather than category during `filter_contigs`.
Updated h5py requirement to be >=3.1.0. 2.10.0 should still work though. This led to some updates to how AnnData was storing info from dandelion after filter_contigs. Should update the tutorial in the next version.
TODO before merging:
Ongoing:
Slight issue with integration with scirpy https://github.com/icbi-lab/scirpy/pull/283, but should be solved. Will edit tests when the PR is merged.
Need to create larger fixtures to get access to some steps within the functions.
Also need to test a mouse fixture. Maybe I should merge a large mouse fixture?