Geographic Utilities
The geographic utilities package provides comprehensive tools for working with geographic data, including enhanced Census utilities, intelligent data selection, and spatial analysis capabilities.
Tiered Installation
The geo package uses a three-tier extras system so you only install what you need:
# Tier 1: Lightweight (no GDAL/GEOS system deps)
pip install siege-utilities[geo-lite] # shapely, pyproj, geopy, censusgeocode
# Tier 2: Full geospatial (requires GDAL/GEOS/PROJ)
pip install siege-utilities[geo] # geo-lite + geopandas, fiona, rtree, tobler, osmnx
# Tier 3: GeoDjango spatial platform
pip install siege-utilities[geodjango] # geo + Django, DRF-GIS, psycopg2
Runtime capability detection:
from siege_utilities.geo.capabilities import geo_capabilities
caps = geo_capabilities()
# {'shapely': True, 'pyproj': True, 'geopandas': False, 'fiona': False, ...}
Functions that require full [geo] dependencies will raise ImportError with
installation instructions when called from a [geo-lite] environment. See
docs/MANAGED_ENVIRONMENTS.md for setup guides for Azure Databricks, Google Colab,
and AWS SageMaker.
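As a rough illustration of how this kind of runtime detection can work, the sketch below probes for optional modules with the standard library and raises an ImportError that carries an install hint. The helper names `detect_capabilities` and `require` are hypothetical and are not the package's API; the real `geo_capabilities()` may check a different set of modules or use a different mechanism.

```python
import importlib.util

# Import names to probe. This tuple is an assumption for the sketch;
# the real geo_capabilities() may check a different set.
OPTIONAL_MODULES = ("shapely", "pyproj", "geopandas", "fiona")


def detect_capabilities(modules=OPTIONAL_MODULES):
    """Return {module_name: bool} without importing the modules themselves."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


def require(module, extra="geo"):
    """Raise ImportError with an install hint if `module` is unavailable."""
    if importlib.util.find_spec(module) is None:
        raise ImportError(
            f"{module} is required for this function. "
            f"Install it with: pip install siege-utilities[{extra}]"
        )


caps = detect_capabilities()
print(caps)
```

Checking with `find_spec` avoids paying the import cost of heavy libraries such as geopandas just to learn whether they are installed.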
Isochrones and CRS
The isochrone module supports ORS and Valhalla providers with configurable CRS,
retry logic, and domain-specific exceptions (IsochroneError,
IsochroneNetworkError, IsochroneProviderError).
from siege_utilities.geo.isochrones import get_isochrone
from siege_utilities.geo.crs import set_default_crs
# Set project-wide default CRS
set_default_crs("EPSG:2263") # NY State Plane
# Get a 15-minute walk isochrone (returned in your default CRS)
isochrone = get_isochrone(lat=40.7128, lon=-74.0060, minutes=15)
See docs/ISOCHRONES_AND_WKLS.md for full details on provider configuration,
timeout handling, and the crs parameter available on all spatial-returning functions.
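The module performs its own retries, so callers normally only need to catch the exceptions. To show the general pattern of retrying transient network failures while letting provider errors surface immediately, here is a minimal sketch. It defines local stand-in exception classes so it runs without the package; the shared `IsochroneError` base class, the `with_retries` helper, and `flaky_fetch` are all assumptions made for illustration.

```python
import time


# Stand-ins so the sketch runs without the package. The names match the
# exceptions listed above; the shared base class is an assumption.
class IsochroneError(Exception):
    pass


class IsochroneNetworkError(IsochroneError):
    pass


class IsochroneProviderError(IsochroneError):
    pass


def with_retries(fetch, attempts=3, base_delay=0.0):
    """Call fetch(), retrying only on network errors with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except IsochroneNetworkError:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the failure
            time.sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical provider call that fails twice, then succeeds.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise IsochroneNetworkError("timed out")
    return {"type": "Polygon", "coordinates": []}


result = with_retries(flaky_fetch)
print(calls["count"], result["type"])
```

Provider errors (for example, a rejected API key) are deliberately not retried here, since repeating the same request would not be expected to change the outcome.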
Overview
The geographic utilities package offers a complete solution for geographic data analysis:
Enhanced Census Utilities: Dynamic discovery and download of Census TIGER/Line boundaries
Intelligent Data Selection: Automatic recommendation of the best Census datasets for your analysis needs
Spatial Data Processing: Comprehensive tools for working with geographic boundaries and spatial data
Geocoding Services: Address geocoding and reverse geocoding capabilities
Data Integration: Seamless integration with external data sources and analytics platforms
Key Features
Census Data Intelligence
The Census Data Intelligence system makes Census data easier to navigate by providing:
Automatic Dataset Selection: Intelligently recommends the best Census datasets based on your analysis type, geography level, and time requirements
Relationship Mapping: Maps relationships between different Census surveys (Decennial, ACS 1-year/5-year, Economic Census, Population Estimates)
Quality Guidance: Provides methodology notes, quality checks, and reporting considerations
Pitfall Prevention: Helps avoid common mistakes like using incompatible datasets or ignoring margins of error
Example Usage:
from siege_utilities.geo import select_census_datasets
# Get recommendations for demographic analysis at tract level
recommendations = select_census_datasets(
    analysis_type="demographics",
    geography_level="tract",
    variables=["population", "income", "education"]
)
# System automatically recommends ACS 5-Year Estimates (2020)
# because it provides stable, detailed data at tract level
primary_dataset = recommendations["primary_recommendation"]["dataset"]
print(f"Use {primary_dataset} for your analysis")
Enhanced Census Utilities
Dynamic Discovery: Automatically discovers available Census years and boundary types
SSL Fallback: Robust handling of network issues with automatic fallback mechanisms
Comprehensive State Information: Complete FIPS codes, names, and abbreviations for all states
Multiple Geography Levels: Support for counties, tracts, block groups, and more
Parameter Validation: Robust validation of input parameters with helpful error messages
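To make the parameter-validation idea concrete, the sketch below normalizes a state identifier to the two-digit FIPS string that boundary downloads typically expect (for example, "06" for California). The `normalize_state_fips` helper and the three-state table are hypothetical and exist only for this illustration; the package's own validation and its complete state table may behave differently.

```python
# Small illustrative subset; the package is described as shipping the full table.
STATE_FIPS = {"CA": "06", "NY": "36", "TX": "48"}


def normalize_state_fips(value):
    """Accept a postal abbreviation, int, or numeric string; return a 2-digit FIPS code."""
    if isinstance(value, str) and value.strip().upper() in STATE_FIPS:
        return STATE_FIPS[value.strip().upper()]
    try:
        # Zero-pad so that 6 and "6" both become "06".
        code = f"{int(value):02d}"
    except (TypeError, ValueError):
        raise ValueError(f"Unrecognized state: {value!r}") from None
    if code not in STATE_FIPS.values():
        raise ValueError(f"Unknown state FIPS code: {code}")
    return code


print(normalize_state_fips("ca"), normalize_state_fips(6), normalize_state_fips("36"))
```

Zero-padding matters because FIPS codes are identifiers rather than numbers: passing `6` where `"06"` is expected is a common source of empty query results.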
Spatial Data Processing
Format Conversion: Convert between GeoJSON, Shapefile, and other spatial formats
Coordinate System Transformation: Transform data between different coordinate reference systems
Database Integration: Connect to PostGIS and other spatial databases
Optional DuckDB Support: Lightweight spatial operations with optional DuckDB integration
Geocoding Services
Address Geocoding: Convert addresses to geographic coordinates
Reverse Geocoding: Convert coordinates to addresses
Batch Processing: Process multiple addresses efficiently
Multiple Providers: Support for various geocoding services
Installation
# Lightweight (no GDAL required)
pip install siege-utilities[geo-lite]
# Full geospatial
pip install siege-utilities[geo]
# Full + Django/PostGIS
pip install siege-utilities[geodjango]
# Development
pip install siege-utilities[geo,testing]
Quick Start
Get Census Data Intelligence:
from siege_utilities.geo import get_census_intelligence

mapper, selector = get_census_intelligence()

# Get dataset recommendations
recommendations = selector.select_datasets_for_analysis(
    "demographics",
    "tract"
)
Download Census Boundaries:
from siege_utilities.geo.spatial_data import census_source

# Download county boundaries for California
counties = census_source.get_geographic_boundaries(
    year=2020,
    geographic_level="county",
    state_fips="06"
)
Use Intelligent Data Selection:
from siege_utilities.geo import quick_census_selection

# Quick selection for business analysis
result = quick_census_selection("business", "county")
print(f"Use {result['recommendations']['primary_recommendation']['dataset']}")
Analysis Types Supported
The intelligent data selection system recognizes these analysis types:
demographics - Population, age, race, ethnicity, income, education
housing - Housing units, value, rent, tenure, vacancy
business - Business counts, employment, industry, payroll
transportation - Commute time, transportation mode, vehicle availability
education - Education level, school enrollment, field of study
health - Health insurance, disability status, veteran status
poverty - Poverty status, public assistance, income
Geography Levels Supported
nation - Country-level data
state - State-level data
county - County-level data
tract - Census tract (neighborhood-level)
block_group - Block group (sub-neighborhood)
block - Census block (smallest unit)
place - City/town data
zip_code - ZIP code areas
cbsa - Metropolitan areas
Census Survey Types
Decennial Census - Complete count every 10 years (highest reliability)
ACS 5-Year Estimates - Estimates pooled over a 5-year period (stable, detailed data, available for small areas)
ACS 1-Year Estimates - Single-year estimates (most recent data; published only for areas with populations of 65,000 or more)
Economic Census - Business establishment counts every 5 years
Population Estimates - Annual estimates between decennial censuses
Data Quality and Reliability
HIGH - Decennial Census, Economic Census (100% counts)
MEDIUM - ACS 5-year estimates (sample-based with margins of error)
LOW - ACS 1-year estimates (higher margins of error)
ESTIMATED - Population estimates (modeled from administrative records)
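One way to carry these tiers through an analysis is a simple lookup keyed by survey type, as in the sketch below. The survey keys (`"acs5"`, `"acs1"`, and so on), the `Reliability` enum, and the `needs_margin_of_error` helper are hypothetical names chosen for this illustration; only the tier assignments come from the list above.

```python
from enum import Enum


class Reliability(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"
    ESTIMATED = "estimated"


# Keys are hypothetical identifiers; the tiers mirror the list above.
SURVEY_RELIABILITY = {
    "decennial": Reliability.HIGH,
    "economic_census": Reliability.HIGH,
    "acs5": Reliability.MEDIUM,
    "acs1": Reliability.LOW,
    "population_estimates": Reliability.ESTIMATED,
}


def needs_margin_of_error(survey):
    """ACS products are sample-based, so their estimates carry margins of error."""
    return SURVEY_RELIABILITY[survey] in (Reliability.MEDIUM, Reliability.LOW)


for survey, tier in SURVEY_RELIABILITY.items():
    print(f"{survey}: {tier.name}, report MOE: {needs_margin_of_error(survey)}")
```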
Best Practices
Always check margins of error for ACS estimates
Use consistent survey types for comparisons
Consider geography limitations when selecting data
Validate data against known benchmarks
Document your data sources and methodology
Use the intelligent selector to avoid common pitfalls
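For the first practice, a margin-of-error check can be done with a few lines of arithmetic. ACS margins of error are published at the 90% confidence level, so dividing by 1.645 gives the standard error; from there you can compute a coefficient of variation and test whether two independent estimates differ. The function names and the income figures below are made up for illustration and are not part of the package.

```python
import math

Z_90 = 1.645  # ACS margins of error are published at the 90% confidence level


def standard_error(moe):
    """Convert a published 90% margin of error to a standard error."""
    return moe / Z_90


def coefficient_of_variation(estimate, moe):
    """CV as a percentage; larger values mean a less reliable estimate."""
    return 100 * standard_error(moe) / estimate


def significantly_different(est_a, moe_a, est_b, moe_b):
    """Two-sided test at the 90% level for two independent estimates."""
    se_diff = math.hypot(standard_error(moe_a), standard_error(moe_b))
    return abs(est_a - est_b) / se_diff > Z_90


# Made-up median incomes for two tracts (estimate, margin of error).
tract_a = (52_000, 4_100)
tract_b = (55_500, 3_900)

print(f"CV for tract A: {coefficient_of_variation(*tract_a):.1f}%")
print("Significantly different:", significantly_different(*tract_a, *tract_b))
```

With these made-up figures the 3,500 gap is smaller than the combined sampling uncertainty, so the two tracts should not be reported as different. The sketch does not handle an estimate of zero, where the coefficient of variation is undefined.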
Examples
Use the maintained example module:
siege_utilities/examples/enhanced_features_demo.py - examples of enhanced Census and package features
For detailed API documentation, see Enhanced Census Utilities and Spatial Data.